Video encoders are notorious for their bloated claims and faulty comparisons. After a quick perusal of their site I see they compare themselves only to WebM, and use generalized percentages to claim differences between products and processes.
They say they've optimized for human perceptual quality. Perhaps they have. But did they really do a better job than the open source H.264 alternative (x264) used by everyone from YouTube and Hulu to Facebook? One that's been in development for _years_ now and beats just about everything thrown at it in a fair comparison?
I think not. I'd love to be wrong though.
When they do eventually come out with comparisons, I hope they have read this first: http://x264dev.multimedia.cx/archives/472
videomini ver. 90.15 options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.41 aq=1:1.00
videomini ver. 89.15 options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=0 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.40 aq=1:1.00
So their encoder is based on x264 (without credit). The settings are slightly different between the clips, so maybe they have some magic thing to detect which settings to use. In any case, it'd be interesting to reencode the original clip using plain x264, once using the same options they used and once using some sane x264 preset.
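To see exactly which settings differ between the two clips, the option strings above can be diffed key by key. A minimal sketch in Python; the function names are mine and not part of any x264 tooling:

```python
def parse_options(line):
    """Split a 'key=value key=value ...' options string into a dict."""
    return dict(item.split("=", 1) for item in line.split())


def diff_options(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ.

    A value of None means the key is absent from that options string.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Option strings copied from the two "videomini" headers quoted above.
OPTS_90_15 = parse_options(
    "cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=1 "
    "psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 "
    "cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 "
    "sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 "
    "constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 "
    "scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 "
    "qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.41 aq=1:1.00"
)
OPTS_89_15 = parse_options(
    "cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=8 psy=0 "
    "mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 "
    "deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=1 "
    "sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 "
    "constrained_intra=0 bframes=0 weightp=2 keyint=250 keyint_min=25 "
    "scenecut=40 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 "
    "qpmin=10 qpmax=51 qpstep=4 ip_ratio=1.40 aq=1:1.00"
)

DIFF = diff_options(OPTS_90_15, OPTS_89_15)
for key, (v90, v89) in DIFF.items():
    print(f"{key}: 90.15={v90} 89.15={v89}")
```

Only six keys differ: psy, psy_rd, chroma_qp_offset, rc_lookahead, mbtree and ip_ratio. Everything else, including crf=23.0 and bframes=0, is identical.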
I'm up for that. It'll take a while to download all the source videos, but I'll report back with results and videos when I'm done.
I already did a test encode with x264 --preset veryslow --tune film --level 4.0 --crf 21 on the Bourne Identity clip, and the result is in the same realm as the Beamr video at ~58% of the file size. Which is not surprising at all, considering Beamr's settings.
EDIT: Testing with nearly identical settings (x264 --bframes 0 --subme 8 --no-psy --no-mbtree) using the latest x264 gives me about 40% of the bitrate of the Beamr example for the Bourne Identity clip, which unsurprisingly looks notably worse. So maybe Beamr's "magical" encoding technology is all about fiddling with x264's CRF to make it double the bitrate at a given value compared to vanilla x264?
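For anyone who wants to repeat the two test encodes, here is a rough sketch that builds both x264 command lines in Python. The flags are the ones quoted above; the file names and the helper function are hypothetical, and nothing is executed unless you pass the lists to subprocess yourself:

```python
import shlex


def x264_command(source, output, extra_args):
    """Build an argv list for the x264 command-line encoder."""
    return ["x264", *extra_args, "-o", output, source]


# Hypothetical file names; substitute your own source clip.
SOURCE = "bourne_source.y4m"

# High-quality encode, flags as quoted in the post.
high_quality = x264_command(
    SOURCE,
    "bourne_veryslow.264",
    ["--preset", "veryslow", "--tune", "film", "--level", "4.0", "--crf", "21"],
)

# "Nearly identical to Beamr" encode, flags as quoted in the EDIT.
# CRF is left unset here, exactly as in the post.
beamr_like = x264_command(
    SOURCE,
    "bourne_beamr_like.264",
    ["--bframes", "0", "--subme", "8", "--no-psy", "--no-mbtree"],
)

for cmd in (high_quality, beamr_like):
    # Print a copy-pasteable shell line. To actually encode, run
    # subprocess.run(cmd, check=True) with x264 installed and on PATH.
    print(shlex.join(cmd))
```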
I'm definitely seeing a loss in quality on the "mini" encodes relative to the source -- mostly in loss of grain, and fine details (in hair, etc).
However, if they're analyzing the motion within the video and using perceptual algorithms to determine what the focal points are, etc. then it's totally fair to throw away details in sections that are peripheral, and likewise getting rid of grain in a blurred-out portion of the screen that's panning makes sense if your eye wouldn't perceive that anyway. This seems to be a much more aggressive version of x264's "psy" optimizations, essentially.
So the resulting video might look very similar in quality to someone watching the movie, but not to someone analyzing the frames and details. There's a hard line to draw on how that's marketed -- it's not actually "without losing quality", but it might be "without looking worse"?
I'm going to do two sets of comparisons for each clip: Beamr's video compared to x264 with similar settings and bitrate, and then Beamr-like settings and high quality settings compared at a much more realistic bitrate that you would actually see in use in digital video on the internet.
Just to note, no credit is necessarily required. x264 is now dual-licensed under GPL2 + commercial license. If they bought a commercial license, there's nothing illegal or wrong in what they do.
Looking into it, they seem to have actually done that. ICVT, the company behind Beamr, can be found on the list of x264 commercial adopters.
Doesn't change the fact that they're major Snake Oil Salesmen, though. Also, the claims on their technology page are still incredibly dubious.
Any encoder manufacturer can find some great looking and easy to encode images that can be massively reduced in data rate without much visible impact. However, in tougher scenarios they will be unlikely to beat the existing market leaders by more than 10% or so under similar constraints (latency/CPU/multi-pass possible) under the review of experts. Tough scenarios will be when there is lots of noise and non-uniform movement. Pop concerts can be very challenging with crowd movements, flashing lights and low general light levels leading to noise.
Their first claimed success was with JPEGmini — this is a lot more credible since JPEG is a pretty obsolete format that simply continues because it's so well supported. JPEG2000 delivers the purported benefits of JPEGmini for real -- it's just poorly supported.
I suspect all they're really doing is fine-tuning compression settings based on differencing input and output using a decent perceptual algorithm.
I uploaded a photo straight out of a Nikon V1 onto JPEGmini and got a 2.2x reduction. I could do better than that by dialing down quality until I saw a difference. JPEG fine is way overkill for most purposes, and that's their market.
They don't seem to have done anything new or substantial by themselves, and as such, this thing is basically nothing but pure marketing bullshit. If you want to do high quality H.264 encodes, just use x264 directly.
Then I took the high quality image and started bumping up the standard JPEG compression level until I reduced the file size to 1.3M. That mapped to a 17/100 setting (with 1 being the highest quality and 100 the highest compression).
Then I made an arithmetic per-pixel diff between the original high quality image and two 1.3M versions. Cropped them to 2000 x 2000 and saved as PNGs (cropping is to fit the imgur file size limits). Here they are -
Now, have a look and tell me that JPEGMini hasn't got something exciting going on.
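The per-pixel diff and crop described above can be sketched in a few lines of Python. The post does not say which tool was used or whether the difference was signed or absolute, so this assumes an absolute difference per channel; images are plain nested lists of (r, g, b) tuples to keep the example self-contained, and the function names are illustrative:

```python
def pixel_diff(img_a, img_b):
    """Per-channel absolute difference of two same-sized RGB images.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have identical dimensions")
    return [
        [
            tuple(abs(ca - cb) for ca, cb in zip(px_a, px_b))
            for px_a, px_b in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(img_a, img_b)
    ]


def crop(img, left, top, width, height):
    """Return the width x height region whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in img[top:top + height]]


# Toy 2x2 "original" and "recompressed" images (made-up values).
original = [[(10, 20, 30), (40, 50, 60)],
            [(70, 80, 90), (100, 110, 120)]]
recompressed = [[(12, 20, 27), (40, 55, 60)],
                [(70, 80, 90), (90, 110, 125)]]

diff = pixel_diff(original, recompressed)
top_left = crop(diff, 0, 0, 1, 1)
print(diff)
print(top_left)
```

A pixel that survived compression untouched comes out as (0, 0, 0), so a darker diff image means a closer match to the original.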
-rw-r--r-- 1 tim users 4930482 Dec 1 21:22 hipnshoot.jpg
-rw-r--r-- 1 tim users 1310430 Feb 26 08:27 hipnshoot_gimp.jpg
-rw-r--r-- 1 tim users 1352882 Dec 3 09:16 hipnshoot_mini.jpg
I put them here: http://www.rareventure.com/jpegmini/
"hipnshoot_mini.jpg" is the "jpegmini" version and "hipnshoot_gimp.jpg" is the one I compressed with plain old gimp.
JPEGMini diff shows less color fringing (fewer "colored" pixels) and it also appears to better preserve edges and outlines.
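To put numbers on how close the two compressed files are in size, here is a quick calculation from the byte counts in the ls listing above (the dictionary keys are just labels I picked):

```python
# Byte counts copied from the ls listing above.
SIZES = {
    "original": 4930482,  # hipnshoot.jpg
    "gimp": 1310430,      # hipnshoot_gimp.jpg
    "mini": 1352882,      # hipnshoot_mini.jpg
}

# Reduction factor of each compressed file relative to the original.
reduction = {
    name: SIZES["original"] / size
    for name, size in SIZES.items()
    if name != "original"
}

# How much smaller the GIMP file is than the JPEGmini file, in percent.
gimp_vs_mini_pct = 100 * (1 - SIZES["gimp"] / SIZES["mini"])

for name, factor in reduction.items():
    print(f"{name}: {factor:.2f}x smaller than the original")
print(f"gimp file is {gimp_vs_mini_pct:.1f}% smaller than the mini file")
```

Both files come out roughly 3.6x to 3.8x smaller than the original, with the GIMP file a few percent smaller than the JPEGmini one, so the diff comparison is not being flattered by a size advantage for JPEGmini.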
That's just a marketing hype description of how all lossy compression works.
>Elements of the recordings that are easily perceived are represented with exacting precision, while other parts that are not very audible can be represented less accurately. Meanwhile, inaudible information can be discarded altogether.
The way you encode lossy formats makes a lot of difference. For example, the same x264 encoder gives dramatically worse-looking results when aiming for optimal PSNR rather than using its psychovisual optimizer.
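For reference, PSNR is a log-scaled mean squared error against the source and contains no model of human vision. A minimal sketch in Python over flat lists of 8-bit sample values (the function name and the toy data are illustrative, not taken from x264):

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# Toy data: every sample is off by exactly 1, so the MSE is 1.
source = [100, 100, 100, 100]
encoded = [101, 99, 101, 99]
print(f"{psnr(source, encoded):.2f} dB")
```

Because the metric only sums squared errors, it treats an error in a flat sky the same as an error in film grain or hair, which is the kind of distinction a psychovisual optimizer tries to make.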
Either they've done something magical and can outcompress x264 to that degree (i.e. in the ballpark of HEVC, yet without breaking compatibility), or they're comparing against a terrible H.264 encoder and they've just wrapped x264 in a proprietary shell of BS (and/or they're liars?).
I say this because claims like this tend to be very source-dependent. Change the source, change the claim.
However, this also means that if you want better encoding, you can just grab x264 and do it yourself.
Yes, but it also means they're not entitled to call it a "patent pending" method.
Apart from the strong PR claims, this is actually not that bad, no? If you have two lossy compressions, but one is perceptually better, you will rather use the perceptually better one, no?
x264 is awesome. super awesome, even.
But when you do, yeah, I understand why the size drop is that big.