
Running FFmpeg on AWS Lambda for 1.9% the Cost of AWS Elastic Transcoder - foob
https://intoli.com/blog/transcoding-on-aws-lambda/
======
mncharity
Similarly, you can _compile_ ffmpeg on Lambda, in 0.5 minutes, for 9 cents.[1]
Versus 10 min on one core, for ~free. And while -j200 of ffmpeg is nice,
-j1000 of the linux kernel is... wow, like seeing the future.

[1] demo in a talk:
[https://www.youtube.com/watch?v=O9qqSZAny3I&t=55m15s](https://www.youtube.com/watch?v=O9qqSZAny3I&t=55m15s)
(the actual run (sans uploading) is at
[https://www.youtube.com/watch?v=O9qqSZAny3I&t=1h2m58s](https://www.youtube.com/watch?v=O9qqSZAny3I&t=1h2m58s)
); code:
[https://github.com/StanfordSNR/gg](https://github.com/StanfordSNR/gg) ; some
slides (page 24):
[http://www.serverlesscomputing.org/wosc2/presentations/s2-wosc-slides.pdf](http://www.serverlesscomputing.org/wosc2/presentations/s2-wosc-slides.pdf)

~~~
adtac
This is impossibly amazing to me! Thank you so much, what an excellent
lecture.

------
benmanns
Very cool. I hope more of the "value add" stuff from cloud providers ($$$) can
be replaced with open source running on their cloud functions. My suggestions:

* FFmpeg supports http/https as input protocols if compiled with the options enabled. See `ffmpeg -protocols`

* You can parallelize or chunk FFmpeg to enable longer inputs, e.g. I found [https://github.com/nergdron/dve/blob/9f1ca516b18f50d1d99d15e...](https://github.com/nergdron/dve/blob/9f1ca516b18f50d1d99d15e1fa70879cb30576cd/dve)

* Try with larger memory sizes. Larger memory = more CPU for Lambda which may result in shorter transcodes. You might even pay the same amount if the transcodes are CPU bound and finish in roughly linear time wrt CPUs
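
A back-of-the-envelope sketch of the last point. It assumes Lambda bills per GB-second of configured memory and that a CPU-bound transcode speeds up linearly with memory; real workloads will only approximate that. The rate constant and the function names are illustrative placeholders, not figures from the article, and per-request fees and billing granularity are ignored.

```python
# Assumed price per GB-second; a placeholder, check the current pricing page.
PRICE_PER_GB_SECOND = 0.00001667


def lambda_cost(memory_mb: int, duration_s: float,
                price_per_gb_s: float = PRICE_PER_GB_SECOND) -> float:
    """Cost of one invocation: configured GB x billed seconds x rate."""
    return (memory_mb / 1024) * duration_s * price_per_gb_s


def scaled_duration(base_memory_mb: int, base_duration_s: float,
                    memory_mb: int) -> float:
    """Hypothetical model: run time shrinks linearly as memory (and CPU) grows."""
    return base_duration_s * base_memory_mb / memory_mb


if __name__ == "__main__":
    # Hypothetical baseline: a 240-second transcode at 512 MB.
    for memory_mb in (512, 1024, 2048):
        duration = scaled_duration(512, 240.0, memory_mb)
        cost = lambda_cost(memory_mb, duration)
        print(f"{memory_mb} MB: {duration:.0f} s, ${cost:.6f}")
```

Under this model the cost stays constant while the wall-clock time drops, which is the "pay the same amount" case; anything less than linear scaling makes the larger sizes cost more.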

~~~
akvadrako
This is not "cool" - this is either doing less than Amazon's encoding service
or just exposing the pricing model. If it's cheaper they could charge less,
unless they are terrible programmers who can't even use their own lambda
functions.

~~~
kwillets
>If it's cheaper they could charge less,

You definitely misunderstand cloud pricing.

~~~
akvadrako
How so? It seems to me this is a perfect example of it.

~~~
thefounder
They have no interest in charging you less. The strategy is to lock you in and
get the most they can from you.

~~~
akvadrako
Of course - that's why this "exposes their pricing model", which is charging
for services.

------
kwillets
Next installment: "Running FFmpeg on a tower under my desk for 1% the cost of
AWS Lambda".

Is there any effort to on-prem lambda stuff yet? I know it's a moving target,
but I wouldn't recommend getting into cloud stuff you can't migrate out of.

~~~
hangtwenty
Yes! If one uses something like Kubeless[1], you have something like AWS
Lambda, but where the backend is Kubernetes rather than AWS. Yes you're still
on a framework, but it's a vendor-agnostic, open-source one. There are some
other attempts at similar things, too. I am partial to this one for now.

[1]:
[https://github.com/kubeless/kubeless](https://github.com/kubeless/kubeless)

~~~
hardwaresofton
This is what I see as the true power of kubernetes -- once people start
developing high quality (hopefully open source) applications for platforms
like kubernetes, providers like AWS should lose their "value-added" benefits,
and be reduced to more like colo providers, maybe offering 24/7 support as
well.

That will be when the ubiquitous cloud truly arrives -- run on whatever
provider in the sky, and as long as they run kubernetes you can run your
workloads there.

------
buildbuildbuild
Their tool which facilitates the packaging and relocation of dynamically
linked binaries is interesting:
[https://github.com/intoli/exodus](https://github.com/intoli/exodus)

"Painless relocation of Linux binaries–and all of their dependencies–without
containers."

~~~
therein
Yeah, seriously. This sounds great.

Also, if you found exodus interesting, you may find the following interesting
too.

[https://github.com/endrazine/wcc](https://github.com/endrazine/wcc)

~~~
yesco
Wow this is amazing, how have I never heard of this before? Thank you for
sharing this.

------
maxk42
Very misleading title. Elastic Transcoder pricing applies primarily to video.
This tutorial only covers audio transcoding, which is much, much less
resource-intensive.

~~~
Johnny555
Nothing misleading here, they compared their project's cost to the Elastic
Transcoder audio pricing.

Elastic Transcoder Audio is $0.00450 per minute [1]; this article says that
with Lambda it cost "$0.00008273 per minute of audio, a full factor of 54
times less than Elastic Transcoder."

0.00450 / 0.00008273 ≈ 54

[1]
[https://aws.amazon.com/elastictranscoder/pricing/](https://aws.amazon.com/elastictranscoder/pricing/)
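
The same comparison as a small script, in case anyone wants to plug in other prices. Both per-minute figures are the ones quoted above; the variable names are just for illustration.

```python
# Per-minute audio prices in USD, as quoted in the comment above.
elastic_transcoder_per_min = 0.00450
lambda_per_min = 0.00008273

# How many times cheaper Lambda is, and Lambda's cost as a share of the
# Elastic Transcoder price.
factor = elastic_transcoder_per_min / lambda_per_min
share = lambda_per_min / elastic_transcoder_per_min

if __name__ == "__main__":
    print(f"Lambda is about {factor:.1f}x cheaper")
    print(f"Lambda costs about {share:.2%} of the Elastic Transcoder price")
```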

------
jwildeboer
Misleading title. The article is about audio encoding, not video. Better: “Using
FFmpeg on AWS Lambda for audio encoding at 1.9% of the cost of AWS Elastic
Transcoder”

------
taion
What's the benefit of using Exodus over just using the official-ish static
builds of FFmpeg?
[https://johnvansickle.com/ffmpeg/](https://johnvansickle.com/ffmpeg/)

This works just fine on AWS Lambda. The `ffmpeg` binary there weighs in at 46
MB. Unless you need something not bundled with that build, it seems like this
is sufficient and is easier to set up.

------
kwillets
It looks like the YouTube download has to complete before ffmpeg can start; is
there a way to start processing the head while the tail is still being
written?

This problem comes up a lot with storage blobs. The bigger they are the worse
it is to serialize write/reads.

~~~
RJIb8RBYxzAMX9u
I don't see why not with I/O redirection, or named pipes. It probably wasn't
done for simplicity.
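
A minimal sketch of the I/O redirection idea, assuming the downloader can write to stdout and the consumer can read from stdin. `pipe_commands` is a made-up helper, not something from the article. The commented-out invocation relies on `youtube-dl -o -` and ffmpeg's `pipe:0` / `pipe:1`, and is untested here.

```python
import subprocess
import sys


def pipe_commands(producer_argv, consumer_argv):
    """Run `producer | consumer`; return (producer_rc, consumer_rc, consumer_stdout)."""
    producer = subprocess.Popen(producer_argv, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_argv, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer_rc = producer.wait()
    return producer_rc, consumer.returncode, output


if __name__ == "__main__":
    # Stand-in commands so the sketch runs without youtube-dl or ffmpeg.
    print(pipe_commands(
        [sys.executable, "-c", "print('head of the stream')"],
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
    ))
    # Hypothetical real use (VIDEO_URL is a placeholder):
    # pipe_commands(["youtube-dl", "-o", "-", VIDEO_URL],
    #               ["ffmpeg", "-i", "pipe:0", "-f", "mp3", "pipe:1"])
```

One caveat: some container layouts cannot be decoded from a non-seekable pipe (for example an MP4 whose index sits at the end of the file), so this only helps when the input format is streamable.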

------
throwaway2016a
This is great. I wouldn't have thought to run FFmpeg on Lambda.

I'm going to stick with Elastic Transcoder for now though. I like that I have
no upgrades to maintain and very little code. I feel like if I did this, it
would take me years to recoup the cost even with a 99% savings.

But that is only because I only have a few videos a month. Roughly $1.00 on
Elastic Transcoder. If I had thousands or even hundreds of videos this seems
like a great and worthwhile project. Especially since this article appears to
take a lot of the trial and error and proof of concept out of the mix.

I worked for a large Internet company that had a Netflix-like product back in
2007. The transcoders were literally just plugged in underneath people's
cubicles. Kept things nice and warm in the winter and I'm sure the costs were
pretty low.

------
jordan314
Cool but only for up to 8 minute videos. Unless you found a way to parallelize
the lambda tasks.

~~~
pfg
I'm far from an FFmpeg expert, but I believe it's possible to segment the
input video, transcode the segments one by one, and then concatenate them. Not
sure how the segmentation and concatenation steps perform, but if that's fast,
this might even improve your overall transcoding speed due to the
parallelization.
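
A rough sketch of what the segment and concatenate steps could look like, using ffmpeg's segment muxer and concat demuxer. The helper names, file names, and chunk pattern are hypothetical; only the ffmpeg options themselves are real. The per-chunk transcode in the middle is left out, since that is whatever command the article already runs.

```python
def segment_command(src, segment_seconds, pattern="chunk_%03d.mp4"):
    """Build an ffmpeg argv that splits src into pieces without re-encoding."""
    return ["ffmpeg", "-i", src, "-c", "copy", "-map", "0",
            "-f", "segment", "-segment_time", str(segment_seconds),
            "-reset_timestamps", "1", pattern]


def concat_list(paths):
    """Render the list file read by ffmpeg's concat demuxer.

    Paths containing single quotes would need escaping; not handled here.
    """
    return "".join(f"file '{p}'\n" for p in paths)


def concat_command(list_file, dst):
    """Build an ffmpeg argv that joins the listed files without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


if __name__ == "__main__":
    print(" ".join(segment_command("input.mp4", 60)))
    print(concat_list(["out_000.mp4", "out_001.mp4"]), end="")
    print(" ".join(concat_command("list.txt", "joined.mp4")))
```

Because the split uses stream copy, cuts land on keyframes, so chunk lengths are only approximately `segment_seconds`. Both steps avoid re-encoding, which is why they tend to be cheap next to the transcode itself.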

~~~
armen52
Media companies are already taking this approach using ffmpeg, AWS Lambda, and
AWS Step Functions. I heard from two companies using such approaches at AWS
re:Invent in October 2017, so it's definitely possible.

Rolling your own approach like this is certainly more complex to
build/maintain than using Elastic Transcoder though.

~~~
brian_cloutier
If you know that you'll need more than 8 minutes, why wouldn't you just run
ffmpeg on EC2? EC2 is now pay-per-second. I haven't looked at the prices
recently, is AWS lambda so much cheaper that it's worth jumping through all
these extra hoops?

~~~
bigcostooge
You can encode about 1 video per EC2 medium instance without losing >1:1
encoding speed. It’s horrendously expensive.

------
lostmsu
Don't you also have to pay for traffic going to/from lambda? In that case raw
audio and video would be very expensive!

~~~
celerity
Traffic to Lambda from the internet and between Lambda and S3 is free. The
only thing you pay for is the transfer cost out of S3 (at cents per GB).

~~~
archgoon
Assuming resulting audio of 3 minutes, then 1000 uses would result in 9 GB, or
about 81 cents. As long as you can get ads for $1 per mille, you should be
good. That said, you'd probably need to implement something to prevent abuse
(single user bypassing the frontend and just spamming your backend).
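
The arithmetic above, spelled out. The 9 MB per file and $0.09 per GB are inferred from the "9 GB" and "81 cents" figures rather than stated outright, so treat both as assumptions and check current S3 transfer pricing before relying on them.

```python
def transfer_gb(uses: int, mb_per_file: float) -> float:
    """Total outbound volume in decimal GB."""
    return uses * mb_per_file / 1000


def transfer_cost(uses: int, mb_per_file: float, price_per_gb: float) -> float:
    """Outbound transfer cost in USD."""
    return transfer_gb(uses, mb_per_file) * price_per_gb


if __name__ == "__main__":
    uses = 1000
    mb_per_file = 9       # assumed: implied by "9 GB per 1000 uses"
    price_per_gb = 0.09   # assumed: implied by "about 81 cents"
    print(f"{transfer_gb(uses, mb_per_file):.1f} GB, "
          f"${transfer_cost(uses, mb_per_file, price_per_gb):.2f}")
```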

Looking forward to the next post in the series.

------
tyrankh
When I worked at Panasonic we did this exact thing. It's remarkably easy for
the cost savings.

------
moonbread
I used an FFmpeg static build to transcode WAV to MP3, but the latest 64-bit
build gave me corrupt files, so I had to hunt down an archived version. Works
well though!

------
sp332
Is it downloading the whole YouTube video just to pull out the audio? Why not
just download the audio to begin with?

~~~
archgoon
What is the API to download only the audio from a YouTube video? How would you
do your proposed solution?

~~~
sp332
Let's take a video I recently linked in another HN comment.
[https://www.youtube.com/watch?v=r_fxB6yrDVo](https://www.youtube.com/watch?v=r_fxB6yrDVo)
If I run youtube-dl -F
[https://www.youtube.com/watch?v=r_fxB6yrDVo](https://www.youtube.com/watch?v=r_fxB6yrDVo)
then I get a bunch of options marked "audio only DASH audio", e.g.

251 webm audio only DASH audio 143k , opus @160k, 78.96MiB

Then if I run youtube-dl -f 251 -g
[https://www.youtube.com/watch?v=r_fxB6yrDVo](https://www.youtube.com/watch?v=r_fxB6yrDVo)
then I get this horrible URL: [https://r1---sn-hxugvj5nu-
cvnl.googlevideo.com/videoplayback...](https://r1---sn-hxugvj5nu-
cvnl.googlevideo.com/videoplayback?mt=1525288098&id=o-ALmpxTJqwTvmb1rI2j8vr7ERIAmvRiXKDCroWbudn5ly&mn=sn-
hxugvj5nu-cvnl%2Csn-
ab5l6n6s&mm=31%2C29&ms=au%2Crdu&ei=Kw3qWriXMuaZ8gTP1Z6gBA&keepalive=yes&mv=m&source=youtube&pl=19&dur=5792.761&lmt=1497010473974174&ip=65.175.128.10&requiressl=yes&clen=82794463&mime=audio%2Fwebm&c=WEB&initcwndbps=1348750&fexp=23724337&ipbits=0&fvip=1&itag=251&expire=1525309835&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&gir=yes&key=yt6&signature=08A1A700F7ADB15AB3444DE412054B0A05FEFD35.5A67EC47A23953CB6F9230A30CF3D66D138525F4&ratebypass=yes)

However, if I run wget "[horrible_url]" -O audio, it still takes forever to
download, so I guess rate-limiting might be the issue. But if download time is
the problem, you could have one server that just downloads the data slowly to
S3 and then kicks off the lambda job on the completed file.
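
The manual steps above could be scripted roughly like this. It is only a sketch: the helper names are made up, the parsing assumes the `-F` listing keeps the "audio only" wording shown in the sample line, and `list_formats` needs youtube-dl on the PATH.

```python
import subprocess


def audio_only_format_ids(listing):
    """Return format codes from lines that `youtube-dl -F` marks "audio only"."""
    ids = []
    for line in listing.splitlines():
        parts = line.split()
        if parts and parts[0].isdigit() and "audio only" in line:
            ids.append(parts[0])
    return ids


def direct_url_command(video_url, format_id):
    """Build the argv for `youtube-dl -f <format> -g <url>`, which prints the media URL."""
    return ["youtube-dl", "-f", format_id, "-g", video_url]


def list_formats(video_url):
    """Run `youtube-dl -F` and return its text output."""
    result = subprocess.run(["youtube-dl", "-F", video_url],
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # First line is the sample from the comment; the second is a made-up video entry.
    sample = ("251 webm audio only DASH audio 143k , opus @160k, 78.96MiB\n"
              "22 mp4 1280x720 hd720\n")
    print(audio_only_format_ids(sample))
    print(direct_url_command("https://www.youtube.com/watch?v=r_fxB6yrDVo", "251"))
```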

------
marta_moreno
Yes well, now you just have to pay the fee for licensing the codecs FFmpeg
gives you ;). What was it? One million dollars for MP4? Good luck with that.

