
I don't know if this is a breakthrough but... Frankly this looks awful.

I think I could handle unrealistic colors but the way they flicker so much frame to frame is really jarring.

It's interesting that the algorithm seems to generate chromatic aberration at hard edges? Most clearly around the letters on the title cards.

The de-noising and super-resolution look quite good to me. It's the colorization that's super unstable and looks ugly. If they'd just left it B&W it would IMHO be more impressive.

I kept waiting for it to get better than a light brown haze with flashes of green... but it never did. The fire looked okay, but even that looked weird at the beginning of the sequence.

What really made me laugh was the blue smoke followed by a headline implying that was all at night. The AI filled in pristine blue skies and inverted the wreckage colors. It actually made up history. I'm calling it: Unintentional automated disinformation.

The original video was interlaced. The deinterlacing part of the algorithm is not very good, and I think they would have gotten better results with a special-purpose pass before handing off to the neural network.

Or by using a better source video, as there's no way the original film was interlaced.

I'm now imagining a low-paid 1930s film job called "interlacer" which consisted of taking every frame and drawing tiny, perfectly straight black lines on it, with a tiny fineliner and a tiny ruler, on odd and even rows, every next frame.

I know of a company that did the opposite to handle 2:3 pulldown for rotoscoping. They had a Photoshop action that would select every other line, copy&paste to a new doc, collapse the empty space, and then paint the frame. Then they would do that a second time for the other half. Finally, more actions to recombine.

I couldn't make this up. My jaw just hit the floor when it was explained to me the first time. I still shake my head typing it up to post here.

I had to look this up: https://en.m.wikipedia.org/wiki/Three-two_pull_down

What tool should have been used to do that conversion (in reverse)? To their credit, it sounds like they solved the problem and moved on.

They solved the problem by increasing the work unnecessarily. So to explain the issue in more detail without requiring peeps to jump to Wikipedia, 2:3 pulldown is how 24 fps film was converted to 29.97 fps interlaced video. This increases the number of frames by roughly 25% (every 4 film frames become 5 video frames), but it does this by repeating fields of the original progressive data. When you step through this 29.97 fps video frame-by-frame, you will see a repeating pattern of 3 progressive frames followed by 2 interlaced frames. To do this asinine made-up workflow, they then increased the number of frames again. Instead, they should have done an inverse telecine (IVTC), which reverses the 2:3 pulldown, returning 24 fps progressive frames. When rotoscoping, you definitely do not want to be creating additional frames "just because". Applying a small amount of logic should have suggested that these interlaced frames are odd, since the original source had nothing but clean progressive frames. If this was introduced in the conversion to video, surely it can be undone.

They, as you, decided it was acceptable to do whatever needed to be done at whatever expense rather than taking 10 minutes to find the proper workflow, which would have saved them money. The company they used to transfer the film to video could have explained how to do this in a phone call of less than 5 minutes. I know because I worked for the company doing the film transfer and had helped several other clients with their video-for-film post workflows. Today, it's even easier because there's like a bazillion write-ups on how to do this posted on the web.

Edit: I didn't answer the question directly after pontificating. IVTC is the process that needed to be applied. Many, many tools exist(ed) for this. The tools at this company's disposal would have been After Effects, Avid, etc. Later, more tools became available, like AviSynth and FFmpeg, and other dedicated tools were created by people to tackle this directly.
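
For anyone who wants to see the cadence concretely, here is a minimal Python sketch of the 2:3 pulldown described above and its inverse. It is a toy model, not how any real tool works: frames are just labels, each video frame is a (top field, bottom field) pair, and the function names `pulldown_23` and `inverse_telecine` are made up for illustration. It also assumes the cadence is known and starts on the first frame, whereas real IVTC tools have to detect the cadence and cope with edits that break it.

```python
def pulldown_23(film):
    """Toy 2:3 pulldown: every 4 film frames become 5 video frames.

    `film` is a list of frame labels whose length is a multiple of 4.
    Each video frame is a (top_field, bottom_field) pair of labels.
    """
    video = []
    for i in range(0, len(film), 4):
        a, b, c, d = film[i:i + 4]
        # Three clean frames (A, B, D) and two mixed frames (B/C and C/D).
        video += [(a, a), (b, b), (b, c), (c, d), (d, d)]
    return video


def is_interlaced(frame):
    """A frame is 'combed' when its two fields come from different film frames."""
    top, bottom = frame
    return top != bottom


def inverse_telecine(video):
    """Toy IVTC: rebuild 4 progressive frames from each group of 5.

    Assumes the cadence is aligned to the start of `video` (an assumption;
    real tools detect this per clip).
    """
    film = []
    for i in range(0, len(video), 5):
        f0, f1, f2, f3, f4 = video[i:i + 5]
        # C is split across the two mixed frames: bottom of f2, top of f3.
        assert f2[1] == f3[0], "cadence is not aligned as assumed"
        film += [f0[0], f1[0], f2[1], f4[0]]
    return film


film = list("ABCD")
video = pulldown_23(film)
print(video)                              # AA, BB, BC, CD, DD
print([is_interlaced(f) for f in video])  # 3 clean frames, then 2 combed
print(inverse_telecine(video) == film)    # the original frames come back
```

The point of the sketch is that nothing is lost in the pulldown: every field in the 5 video frames belongs to one of the 4 film frames, so recombining matching fields and dropping the duplicates recovers the original progressive frames without painting anything twice.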

A flow like this could have been someone's job security.

Analogue video (e.g. television) interlacing did actually begin in the 1930s, a few years before the Hindenburg disaster. But the original disaster footage would have been shot on film (not interlaced of course), cut into newsreels and then converted to a television signal via telecine.

Supposedly the original newsreels still exist, preserved by the National Film Registry: https://en.wikipedia.org/wiki/Hindenburg_disaster_newsreel_f...

Original film may have flickered, or the playback process to capture may have induced flicker.

What is missing is object continuity with respect to color. That would quiet things down tremendously. Right now it is as if every object gets re-painted from one frame to the next in a completely new (and often garish or wildly incorrect) color.

I believe there is already quite a bit of continuity; otherwise the colors would flicker much more strongly from frame to frame. In the video, the color varies smoothly from frame to frame.

There are many instances of frame-to-frame discontinuity that I can't explain other than by a lack of object detection and labeling. It would be less wrong to use the color from the previous frame even if the lighting changes than to use an entirely different hue for the same object.

Only things like TV screens and other displays (and some interesting objects covered with micro surfaces that can cause light interference) can change color that rapidly given the same color incident light.
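
To make the "reuse the previous frame's color" idea a bit more concrete, here is a rough Python sketch. It is not what the colorization model in the video does, and it is much cruder than real object tracking: it keeps each frame's brightness but blends the color channels per pixel with a running average from earlier frames. The function name `smooth_chroma`, the `alpha` parameter, and the (luma, cb, cr) pixel layout are all assumptions made for illustration, and because it ignores motion it would smear color on anything that moves.

```python
def smooth_chroma(frames, alpha=0.25):
    """Blend chroma over time while leaving luma untouched.

    `frames` is a list of frames; each frame is a list of (y, cb, cr) tuples.
    `alpha` is the weight of the new frame's chroma; smaller values mean
    colors change more slowly. Per-pixel only: no motion compensation and
    no object detection, which a real solution would need.
    """
    out = []
    prev = None
    for frame in frames:
        if prev is None:
            cur = list(frame)  # first frame: nothing to blend with yet
        else:
            cur = [
                (
                    y,  # keep this frame's own brightness
                    (1 - alpha) * prev_cb + alpha * cb,
                    (1 - alpha) * prev_cr + alpha * cr,
                )
                for (y, cb, cr), (_, prev_cb, prev_cr) in zip(frame, prev)
            ]
        out.append(cur)
        prev = cur
    return out


# One-pixel "video" whose color jumps wildly on the second frame.
frames = [[(100, 0.0, 0.0)], [(110, 100.0, -100.0)]]
print(smooth_chroma(frames))
```

With `alpha=0.25` the second frame keeps its own luma (110) but only moves a quarter of the way toward the new color, which is the "less wrong" behavior argued for above. Doing this per object instead of per pixel is the hard part.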

I think they get that not because of putting in that constraint, but only because subsequent frames are similar. That makes the coloring algorithm pick similar coloring.

Looks like the AI can handle objects it "knows" (people, grass, sky etc.) okish, but is completely confounded by the Zeppelin - which is sad, as the Hindenburg is of course present in most of the shots. Maybe they should have trained it with some color(ized) photos of the Hindenburg (like this one: https://www.alamy.de/das-deutsche-zeppelin-luftschiff-die-hi...) first?

This is where a human videographer could trivially do better or add value. The obsession with "automating everything" is a real disease in ML and CS generally. Sometimes it makes things worse! This is one of those situations.

Only thing that looked good was the closeup of the captain. That one short segment was very well done.

For an early alpha version, it is ok-ish. For anything but that, it is terrible.

I had to stop 60% of the way through. It was giving me a headache.

This video proves to me that you can't denoise, upscale, and then colorize without stabilizing first. The result is too jarring, given that image stabilization is a known transform.

There are also certain frames where the smoke turns into green trees

People complaining about the colors shifting don't realize it pulsated red in time to the phat beats it was dropping.
