New Solution Can Cut Video File Size By Half – Without Losing Quality (nocamels.com)
82 points by jackau 1606 days ago | 41 comments

Especially since this states H.264 compliance, I'd be willing to say this can't possibly live up to its claims.

Video encoders are notorious for their bloated claims and faulty comparisons. After a quick perusal of their site I see they compare themselves only to WebM, and use generalized percentages to claim differences between products and processes.

They say they've optimized for human perceptual quality. Perhaps they have. But did they really do a better job than the open source H.264 alternative (x264) used by everyone from YouTube and Hulu to Facebook? One that's been in development for _years_ now and beats just about everything thrown at it in a fair comparison? I think not. I'd love to be wrong though.

When they do eventually come out with comparisons, I hope they have read this first: http://x264dev.multimedia.cx/archives/472

From beamrvideo.com, Clip1_mini.mov contains:

  videomini ver. 90.15 options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.41 aq=1:1.00
And Clip2_mini.mov contains:

  videomini ver. 89.15 options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=0 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.40 aq=1:1.00
For comparison, here's what a clip encoded with x264 contains:

  x264 - core 125 r2200 999b753 - H.264/MPEG-4 AVC codec - Copyleft 2003-2012 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x1:0x111 me=hex subme=5 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=2 keyint=240 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=2pass mbtree=1 bitrate=373 ratetol=1.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 cplxblur=20.0 qblur=0.5 ip_ratio=1.40 aq=1:1.00

So their encoder is based on x264 (without credit). The settings are slightly different between the clips, so maybe they have some magic thing to detect which settings to use. In any case, it'd be interesting to reencode the original clip using plain x264, once using the same options they used and once using some sane x264 preset.

>In any case, it'd be interesting to reencode the original clip using plain x264, once using the same options they used and once using some sane x264 preset.

I'm up for that. It'll take a while to download all the source videos, but I'll report back with results and videos when I'm done.

I already did a test encode with x264 --preset veryslow --tune film --level 4.0 --crf 21 on the Bourne Identity clip, and the result is in the same realm as the Beamr video at ~58% of the file size. Which is not surprising at all, considering Beamr's settings.

EDIT: Testing with nearly identical settings (x264 --bframes 0 --subme 8 --no-psy --no-mbtree) using the latest x264 gives me about 40% of the bitrate of the Beamr example for the Bourne Identity clip, which unsurprisingly looks notably worse. So maybe Beamr's "magical" encoding technology is all about fiddling with x264's CRF to make it produce double the bitrate at a given value compared to vanilla x264?
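For anyone who wants to reproduce the two test encodes described above, here's a sketch of them as argv lists (filenames are placeholders, and the matched-settings flags are read off Beamr's encoder header, including its crf=23.0):

```python
# Sketch: the two x264 test encodes as explicit argv lists. Run them with
# subprocess.run() if x264 is installed; here we just print the commands.

def x264_cmd(infile, outfile, *extra):
    """Build an x264 command line: options first, then output and input."""
    return ["x264", *extra, "-o", outfile, infile]

# High-quality preset encode (the ~58%-of-Beamr's-size result):
preset_cmd = x264_cmd("bourne.y4m", "bourne_preset.mkv",
                      "--preset", "veryslow", "--tune", "film",
                      "--level", "4.0", "--crf", "21")

# Near-identical-settings encode, matching Beamr's header as closely as possible:
matched_cmd = x264_cmd("bourne.y4m", "bourne_matched.mkv",
                       "--bframes", "0", "--subme", "8",
                       "--no-psy", "--no-mbtree", "--crf", "23")

print(" ".join(preset_cmd))
print(" ".join(matched_cmd))
```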

Justin from Zencoder here... I've been doing a bit of similar testing and seeing similar results. So they're obviously controlling the settings per frame (or per segment) during the encode, meaning the values in that encoder line are not very useful in analyzing what they're doing.

I'm definitely seeing a loss in quality on the "mini" encodes relative to the source -- mostly a loss of grain and fine detail (in hair, etc.).

However, if they're analyzing the motion within the video and using perceptual algorithms to determine what the focal points are, etc. then it's totally fair to throw away details in sections that are peripheral, and likewise getting rid of grain in a blurred-out portion of the screen that's panning makes sense if your eye wouldn't perceive that anyway. This seems to be a much more aggressive version of x264's "psy" optimizations, essentially.

So the resulting video might look very similar in quality to someone watching the movie, but not to someone analyzing the frames and details. There's a hard line to draw on how that's marketed -- it's not actually "without losing quality", but it might be "without looking worse"?

Based on the comparing I've done on the second clip so far, they seem to be doing absolutely nothing special - at approximately the same bitrate and settings, the videos are practically identical quality-wise (in fact, the most recent x264 seems to fare a small bit better).

I'm going to do two sets of comparisons for each clip: Beamr's video compared to x264 with similar settings and bitrate, and then Beamr-like settings and high quality settings compared at a much more realistic bitrate that you would actually see in use in digital video on the internet.

I just came back to add that, encoding 2-pass to the same bitrate that theirs results at, I have a very hard time telling the difference. I look forward to your comparison sets.

I've finished my comparisons now. HN submission here: http://news.ycombinator.com/item?id=5289532

> (without credit)

Just to note, no credit is necessarily required. x264 is now dual-licensed under GPL2 + commercial license. If they bought a commercial license, there's nothing illegal or wrong in what they do.

>If they bought a commercial license, there's nothing illegal or wrong in what they do.

Looking into it, they seem to have actually done that. ICVT, the company behind Beamr, can be found on the list of x264 commercial adopters[1].

Doesn't change the fact that they're major Snake Oil Salesmen, though. Also, the claims on their technology page[2] are still incredibly dubious.

[1] http://x264licensing.com/adopters

[2] http://beamrvideo.com/main/technology

There would be nothing wrong, apart from claiming the technology as their own, if they did this under the GPL2+ as well.

Incremental improvements in H.264 are still possible but not a halving of bitrate in the general case. The required rate depends very much on the content of the video and the size at which it will be viewed.

Any encoder manufacturer can find some great-looking and easy-to-encode images that can be massively reduced in data rate without much visible impact. However, in tougher scenarios they will be unlikely to beat the existing market leaders by more than 10% or so under similar constraints (latency/CPU/multi-pass possible) under the review of experts. Tough scenarios are ones with lots of noise and non-uniform movement. Pop concerts can be very challenging, with crowd movements, flashing lights and low general light levels leading to noise.

The more detailed claim is 5-6Mbps streams can be reduced to 3-4, so it's more like a 25% improvement methinks.
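Quick arithmetic on that claim (a sketch; the endpoints are straight from the "5-6 Mbps down to 3-4 Mbps" figure, so the actual saving depends on which ends you pair up):

```python
# Claimed reduction: 5-6 Mbps streams cut down to 3-4 Mbps.
# The percentage saved depends entirely on which endpoints you pair.

def reduction(before_mbps, after_mbps):
    """Fraction of bitrate saved."""
    return 1 - after_mbps / before_mbps

best = reduction(6, 3)     # most favorable pairing
worst = reduction(5, 4)    # least favorable pairing
mid = reduction(5.5, 3.5)  # midpoint of both ranges

print(f"best {best:.0%}, worst {worst:.0%}, midpoint {mid:.0%}")
# → best 50%, worst 20%, midpoint 36%
```

So "about 25%" is plausible at the conservative end, while the midpoint pairing works out closer to a third.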

Their first claimed success was with JPEGmini — this is a lot more credible since JPEG is a pretty obsolete format that simply continues because it's so well supported. JPEG2000 delivers the purported benefits of JPEGmini for real -- it's just poorly supported.

I suspect all they're really doing is fine-tuning compression settings based on differencing input and output using a decent perceptual algorithm.
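If that guess is right, the core loop would look something like this (a minimal sketch with stubbed-out encode and perceptual-distance functions; the real implementation and its metric are unknown):

```python
def find_min_quality(encode, distance, threshold, qualities=range(95, 10, -5)):
    """Walk the quality setting downward and return the lowest one whose
    output is still perceptually within `threshold` of the source.
    Assumes `distance` grows (roughly) monotonically as quality drops."""
    best = qualities[0]
    for q in qualities:
        if distance(encode(q)) <= threshold:
            best = q
        else:
            break
    return best

# Stub demo: pretend the perceptual distance is (100 - quality) / 100,
# and anything within 0.6 of the source is "visually identical".
encode = lambda q: q
distance = lambda out: (100 - out) / 100
print(find_min_quality(encode, distance, 0.6))  # → 40
```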

I uploaded a photo straight out of a Nikon V1 onto JPEGmini and got a 2.2x reduction. I could do better than that by dialing down quality until I saw a difference. JPEG fine is way overkill for most purposes, and that's their market.

As others have noted, Beamr's "magical" technology simply seems to be using x264, the state-of-the-art in H.264 encoders, which also happens to be free and open source software. Looking at their site, beamrvideo.com, the bitrates they list for the originals are what you'll typically find on Blu-ray discs, not on online streams. Anyone who knows a thing or two about video encoding and x264 will know that you can re-encode these to a much smaller size while retaining most of the visual quality. As such, claiming that they can reduce the bitrate to half with no notable loss in quality is like claiming that you can reduce the bit depth and sampling rate of a 24-bit 192 kHz audio file to 16-bit 48 kHz with no notable loss in quality - it's certainly true, but you hardly needed the 24-bit 192 kHz quality to begin with.
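The audio analogy holds up numerically (a sketch of uncompressed stereo PCM rates):

```python
def pcm_kbps(bit_depth, sample_rate_hz, channels=2):
    """Uncompressed PCM data rate in kilobits per second."""
    return bit_depth * sample_rate_hz * channels / 1000

hires  = pcm_kbps(24, 192_000)  # 24-bit / 192 kHz stereo
cd_ish = pcm_kbps(16, 48_000)   # 16-bit / 48 kHz stereo

print(f"{hires:.0f} -> {cd_ish:.0f} kbps, a {hires / cd_ish:.0f}x reduction")
# → 9216 -> 1536 kbps, a 6x reduction
```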

They don't seem to have done anything new or substantial by themselves, and as such, this thing is basically nothing but pure marketing bullshit. If you want to do high quality H.264 encodes, just use x264 directly.

For the lazy - http://jpegmini.com

I grabbed 4th pair of their demo image [0], the high quality image is about 4.8M, the JpegMini'd version - 1.3M.

Then I took the high quality image and started bumping up the standard JPEG compression level until I reduced the file size to 1.3M. That mapped to a 17/100 setting (with 1 being highest quality and 100 highest compression).

Then I made an arithmetic per-pixel diff between the original high quality image and two 1.3M versions. Cropped them to 2000 x 2000 and saved as PNGs (cropping is to fit the imgur file size limits). Here they are -



Now, have a look and tell me that JPEGMini hasn't got something exciting going on.


[0] http://media.jpegmini.com/homepageImages/20121203/hipnshoot....

Haha, I only kind of skimmed your post then downloaded that file. I opened the images and was like "I don't know how high-quality this is. The images look exactly the same to me." And then I saw the file sizes and had the ah-ha moment.

Maybe I'm not the most knowledgeable guy here, but I really don't see what you're getting at. The per-pixel diffs look nearly the same. I downloaded the "hipnshoot.zip" and yes the photos look identical. But then I fired up gimp and re-compressed the high quality photo to reduce the size even more than the "jpegmini" version, and I can't tell the difference. Here are the file sizes:

-rw-r--r-- 1 tim users 4930482 Dec 1 21:22 hipnshoot.jpg

-rw-r--r-- 1 tim users 1310430 Feb 26 08:27 hipnshoot_gimp.jpg

-rw-r--r-- 1 tim users 1352882 Dec 3 09:16 hipnshoot_mini.jpg

I put them here: http://www.rareventure.com/jpegmini/

"hipnshoot_mini.jpg" is the "jpegmini" version and "hipnshoot_gimp.jpg" is the one I compressed with plain old gimp.

I don't see much of a difference. Is that the point? Are you saying that JPEGMini are just compressing to ~83% quality?

No, there is a noticeable difference.

JPEGMini diff shows less color fringing (fewer "colored" pixels) and it also appears to better preserve edges and outlines.

I'm not sure how much it could change anything here, but Imgur also compresses images when uploaded. Maybe try min.us?

PNG is lossless.

JPEGmini better preserves color information, but it doesn't seem to be that much better.

>the compression method mimics the human eye and removes elements that would not have been processed by the human eye in the first place.

That's just a marketing-hype description of how all lossy compression works.

I wrote a JPEG encoder once, and if I remember correctly you can change the quantization tables and the Huffman tables. My encoder just used the default tables in the standard, but it wouldn't surprise me if they optimize these using a psychovisual fitness function.

Yes, but the default tables in the standard do mimic the human eye's limitations. That's what increasing the quant factor on higher frequencies is.
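For reference, this is how the IJG reference implementation (libjpeg) scales those base tables by the user's quality setting (a sketch of the well-known `jpeg_quality_scaling` formula; the excerpt is the first row of the standard's example luminance table):

```python
def scale_quant(table, quality):
    """libjpeg-style scaling of a base quantization table by quality 1-100.
    Larger entries mean coarser quantization of that DCT frequency."""
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [max(1, min(255, (q * scale + 50) // 100)) for q in table]

# First row of the JPEG standard's example luminance table: the divisors
# grow toward higher horizontal frequencies, mimicking the eye's reduced
# sensitivity to fine detail.
lum_row0 = [16, 11, 10, 16, 24, 40, 51, 61]

print(scale_quant(lum_row0, 50))  # → unchanged: scale factor is exactly 100
print(scale_quant(lum_row0, 90))  # → much finer quantization everywhere
```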

Agreed, reminds me of mp3.

>Elements of the recordings that are easily perceived are represented with exacting precision, while other parts that are not very audible can be represented less accurately. Meanwhile, inaudible information can be discarded altogether.

Fraunhofer Institute

I'm a bit sceptical of such glorious claims from a company that apparently just made a UI for libjpeg.

AFAIK JPEGmini has much better psychovisual optimization than libjpeg, so I think their claim that they can compress JPEG better within the limits of the format is fair.

The way you encode lossy formats makes a lot of difference. For example, the same x264 encoder gives dramatically worse results when aiming for optimal PSNR rather than using its psychovisual optimizer.



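PSNR itself is trivial to compute, which is part of why it's such a tempting (and misleading) optimization target (a sketch; it weights every pixel error equally, unlike the eye):

```python
import math

def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length 8-bit
    images. Higher is 'better', but it rewards blur over preserved texture."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

print(round(psnr([10, 20, 30, 40], [11, 19, 30, 42]), 2))  # → 46.37
```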

Is it not a bit ironic to be linking to x264 developers to support the thesis that a new H.264 encoder can be twice as effective at compressing H.264 as the competition?

Either they've done something magical and can outcompress x264 to that degree (i.e. in the ballpark of HEVC, yet without breaking compatibility), or they're comparing against a terrible H.264 encoder and they've just wrapped x264 in a proprietary shell of BS (and/or they're liars?).

It's undoubtedly bullshit.

Let them put up a long, complex video -- an action movie, for example -- using H.264, and their own method, for side-by-side comparison, both with regard to quality and file size. Let them offer something better than hand-waving press releases.

I say this because claims like this tend to be very source-dependent. Change the source, change the claim.

They appear to be using modern x264. This does outperform most (all?) commercial H.264 encoders to the degree they claim, so their claims can have merit.

However, this also means that if you want better encoding, you can just grab x264 and do it yourself.

> They appear to be using modern x264. This does outperform most (all?) commercial H.264 encoders to the degree they claim, so their claims can have merit.

Yes, but it also means they're not entitled to call it a "patent pending" method.

They use x264 on near-default settings.

It is true that it's the same company, however, the link you're referring to is about image compression and not video compression.

So, in the case of jpegs, their compression is not lossless, they just claim the losses are not that perceptually visible.

Apart from the strong PR claims, this is actually not that bad, no? If you have two lossy compressions, but one is perceptually better, you would rather use the perceptually better one, no?

I see this the same way Apple pitches the Retina display (where, supposedly, you can't resolve individual pixels from a standard viewing distance).

Just using x264 with veryslow and/or two pass setting usually cuts file sizes by half compared to e.g. an IP or phone camera output. They could have some magic there, but you can get similar results at home without having to pay anyone. You just have to read a little about how to run x264.

x264 is awesome. super awesome, even.

So, since it seems to be fairly well established that they're just using x264, anyone got any idea how they plan on making money from this, er, product/service they're offering?

What encoder did they compare it with?

Of course, until you compare pixel by pixel, you see no difference.

But when you do, yeah, I understand why the size drop is that big.
