
How I Made Porn Video Streaming More Efficient with Python and C - crm416
http://www.toptal.com/python/how-i-made-porn-20x-more-efficient-with-python
======
danso
I sympathize with why the mods changed the title from "Porn" to "Video", but
now you've made it so that HN users are unwittingly clicking into an article
that talks about porn, in such a way that could be detected by a workplace
firewall. That's not an ideal usability decision. Why not use good ol square
brackets to maintain the integrity of the title:

How I Made [Porn] Video Streaming 20x More Efficient with Python and C

~~~
peterhajas
Why do mods change titles of posts? This is frequently complained about on HN,
and seems unfortunate.

~~~
philh
Mostly they make things better, and nobody notices. Sometimes they make things
worse, and people complain.

~~~
enraged_camel
>>Mostly

I don't mean to be a hardass, but... citation needed.

Seriously, we have no data on the number of mod edits made vs. complaints
voiced. As such it seems premature to conclude that mods _mostly_ do a good
job with these arbitrary edits.

------
jallmann
Agreed that RTMP is an abomination that needs to be exorcised from the Earth
as soon as possible. Unfortunately, it is probably here to stay until
something like WebRTC gains critical mass.

It's not clear what the article means by "repackaging" a stream or "pointers"
to tags (especially in the diagram that shows tag pointers being transported
between users). While RTMP is cumbersome, shoving media data (tags) under a
per-session protocol header is essentially the standard way of moving data
from one session to another.

So I'm not really following this. Is this cutting out the RTMP entirely for
receiving clients, and instead sending the FLV down via another transport,
like HTTP or whatever? Or is it more of, "I wrote my own RTMP streaming server
in C and Python", along with some implementation details which I'm not
understanding? (Not that there's anything wrong with doing so. Options are
limited in the streaming-server space.)

~~~
toptalgergely
Author of the original post here.

>It's not clear what the article means by "repackaging" a stream or "pointers"
to tags (especially in the diagram that shows tag pointers being transported
between users).

By repackaging I meant extracting the FLV tags which pretty much travel in the
same format as in an FLV stream (.flv file) if memory serves well. Pointers to
tags refer to the internal implementation. I took the FLV tags out of the RTMP
stream, which resulted in an almost complete FLV stream. With the proper
header prepended to it you could save it as a file and play it or stream it
and play it. That's just what I did, created a header for every new user, and
after that was sent I could just stream the FLV tags from a common buffer. The
users had pointers that pointed in this buffer so after the header is sent it
was true multicasting.
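
A minimal sketch of the idea described above, not the original implementation: one shared list of FLV tags, plus a per-viewer position into it. The names `TagBuffer`, `publish`, `join` and `pending` are hypothetical. Only the fixed 13-byte FLV file header is shown; the header crafted for each new user in the real system may well have carried more.

```python
import struct

# Standard FLV file header: "FLV" signature, version 1, flags 0x05
# (audio + video present), header size 9, followed by the 4-byte
# PreviousTagSize0 field, which is always 0.
FLV_HEADER = b"FLV" + bytes([1, 0x05]) + struct.pack(">I", 9) + struct.pack(">I", 0)


class TagBuffer:
    """Hypothetical shared buffer of FLV tags with per-viewer read positions."""

    def __init__(self):
        self.tags = []     # FLV tags extracted from the source RTMP stream
        self.cursors = {}  # viewer id -> index of the next tag to send
        # Assumption: a real server would also trim old tags; omitted here.

    def publish(self, tag: bytes) -> None:
        """Append one extracted tag; it is stored once for all viewers."""
        self.tags.append(tag)

    def join(self, viewer_id: str) -> bytes:
        """Register a viewer at the live position and return their header."""
        self.cursors[viewer_id] = len(self.tags)
        return FLV_HEADER

    def pending(self, viewer_id: str) -> list:
        """Return the tags this viewer has not been sent yet."""
        start = self.cursors[viewer_id]
        self.cursors[viewer_id] = len(self.tags)
        return self.tags[start:]
```

Each viewer costs one header and one integer; the tag bytes themselves are never copied per viewer, which is the point of the shared buffer.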

> So I'm not really following this. Is this cutting out the RTMP entirely for
> receiving clients, and instead sending the FLV down via another transport,
> like HTTP or whatever? Or is it more of, "I wrote my own RTMP streaming
> server in C and Python", along with some implementation details which I'm
> not understanding? (Not that there's anything wrong with doing so. Options
> are limited in the streaming-server space.)

Yes. From the source RTMP stream I extract the FLV tags, which I could use to
multicast. Sending the same RTMP stuff to every user would not work, but I can
easily send the FLV tag stream over HTTP if I send the crafted header first.

I hope that helped.

~~~
jallmann
> I extract the FLV tags, which I could use to multicast

I assume you don't really mean IP multicast (which would, incidentally, be one
approach to edge-origin mirroring for the popular marketing campaigns you
mentioned, at least within a data center).

Anyway that makes sense, simply sending the FLV is clever. What's the method
of delivery -- chunked HTTP? Does it play well with proxies?

~~~
toptalgergely
No, not IP multicast. Imagine it as a web server that serves FLV files which
always start at the current stream position. This is what it did.

Actually bringing up the proxy issue is interesting. I'm not sure if it does.
Essentially forbidding all caching would make it play nice.
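
As a rough illustration of "forbidding all caching", here is a sketch of the HTTP response head such a server could send before the FLV bytes. The function name and the exact set of headers are assumptions, not what the original system sent; the thread does not say whether chunked encoding was used, so this version simply omits `Content-Length` and lets the body run until the connection closes.

```python
def live_flv_response_head() -> bytes:
    """Build an HTTP/1.1 response head for an endless, uncacheable FLV body."""
    lines = [
        "HTTP/1.1 200 OK",
        "Content-Type: video/x-flv",
        # Tell browsers and intermediate proxies not to store or reuse the body.
        "Cache-Control: no-store, no-cache, must-revalidate",
        "Pragma: no-cache",   # for old HTTP/1.0 caches
        "Expires: 0",         # an invalid date is treated as already expired
        # No Content-Length: the body ends when the server closes the socket.
        "Connection: close",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The FLV header and the tags from the current stream position would then be written to the same socket right after these bytes.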

~~~
MBCook
Isn't that pretty close to how HTTP Live Streaming works?

~~~
toptalgergely
You mean the Apple protocol? Not sure, but I looked at that when I designed
it. What I did was just the most intuitive approach.

------
hardik988
I appreciate the effort the mods take in curating titles, I really do - but
please spare articles like this at least?

I clicked on this link at work (part-time at grad-school) and now I have a
"how to run a pornographic website faster" link logged in my name.

~~~
sc00ter
Oh, please. The article uses the word porn exactly three times in the
introductory paragraphs. Four if you count the title.

The comments here have double that, including your own use. If you were
genuinely that paranoid about being "logged" for having visited a page that
used the word porn (really?!) I doubt you'd be using it yourself.

~~~
hardik988
I apologize if it sounded that way, but I wasn't insinuating that I'm going to
get into trouble.

I'm just saying that it could get someone into trouble.

As far as the "logging" goes, it had more to do with the word porn in the
title of the page, because many content-filters just parse the title for
blacklisted words.

------
kbenson
I love reading articles about technical issues and solutions in the porn
industry. It's like getting a peek inside a YouTube-scale company as they
grow.

~~~
niggler
Are any of the porn sites larger than youtube?

~~~
magikbum
I would rather learn if there was a porn site bigger than Netflix.. as Netflix
currently uses around 30% of US bandwidth [1]

1\. <http://www.pcmag.com/article2/0,2817,2395372,00.asp>

~~~
skriticos2
Probably. Netflix has the big constraint of availability that only extends to
the US. Porn is global. Even though the US has an above-average bandwidth usage
(guesstimated) it's still only a small(ish) drop in the global network.

~~~
toptalgergely
EU traffic in our case is about 1.2x the US traffic.

------
coldtea
I'm astonished by some of the responses here, that could be summarised thusly:

> _"I work at a workplace, in a modern western society, not some theocratic
> backwater, that monitors my web activity and would frown if I visited an
> article with the word porn in it. This in 2013. I find this OK, and won't
> quit my job or raise hell protesting this degrading treatment, but would
> rather complain about HN titles"_.

In an age where people fight for LGBT rights, this is what the American
workplace has come to?

~~~
_delirium
Unfortunately, the majority of jobs in the United States are run with this
kind of degrading treatment. Many even require you to randomly pee into a cup
on short notice, to make sure you didn't do any drugs recently. Some of them
also monitor your Facebook accounts (as far as permissions allow) to see what
you're up to there. A few require you to hand over your Facebook passwords (!)
to the boss to make that easier. The corporate world is weird and scary, but
not always easy to avoid.

~~~
SilasX
>A few require you to hand over your Facebook passwords (!) to the boss

That was a poorly sourced, probably made-up story that no one could verify yet
quickly became accepted truth. (Unless it's meant to refer to cases involving
heavy security clearances, in which case it's ho-hum.)

------
xSwag
>The aggregated bandwidth of the clusters was around 50 Gbps, from which they
used around 10 Gbps while at peak load.

Now that is a lot of porn.

Also, the illustrations look really good! How did you make them?

~~~
OtherPlanet
Very glad you like the illustrations. First we created a couple of sketches
on paper and then recreated them in Photoshop.

------
toptal
Hey this is awesome! Though the admin/moderator changed the title for some
unknown reason from the title of the blogpost to their own.

~~~
vsh
Of course, because of snobbery. HN visitors should be protected from words
that start with "p" and end with "orn".

~~~
niggler
"HN visitors should be protected from words that start with "p" and end with
"orn"."

popcorn? preworn?

~~~
jfb

      % egrep '^p.*orn$' /usr/dict/words

~~~
jvoorhis
I think you mean /usr/share/dict/words.

~~~
jfb
Yep. That's what I get for trying to be clever.

------
ianhawes
If anyone is interested in an alternative to the usual RTMP servers (FMS,
Red5, and Wowza), I highly recommend EvoStream (<http://www.evostream.com>).
Compared to the alternatives, EvoStream is much more efficient. I believe
TinyChat published a whitepaper discussing their transition from Red5 to
EvoStream, which resulted in a decline in the number of required servers.

EvoStream is a highly scalable streaming media server written in C++ based off
of the open source RTMPD (<http://www.rtmpd.com>). The commercial company,
also called EvoStream, is a relatively new startup and they do great custom
work for those not familiar with streaming media/RTMP.

------
lmm
It's easy to underestimate the power of switching to a better language by just
doing it - guess at the syntax until it works, then refactor as you start to
understand the language and its culture more. In fact I've found I learn
faster this way than any other.

------
chopsueyar
Is this similar to what Wowza (wowzamedia.com) does?

In terms of licensing, it is $55/mo/instance or $995 for a one-time license.

There are also EC2 instances that start at 15 cents per hour.

Much less expensive than FMS

~~~
mtrimpe
You also have Red5 which is open source and rocks very much. It is just a
-tad- less reliable than Wowza which you can't really sell to paying
customers, but it's much more fun and customizable.

And to answer your question, Wowza is just a cheaper FMS which also supports
other platforms next to RTMP. This article is basically taking the more
performant HTTP download mechanism for static content (like YouTube uses) and
then hacking it to put a live stream in it instead.

------
dangayle
I got hit with a firewall and a "This event will be reported".

Thanks, HN Moderator.

------
buster
Strange how 3/4 of the comments are more concerned with the title link,
instead of the actual content.

Very nice post, was an interesting read!

~~~
tekacs
To be fair I clicked on the article and immediately came back to the comments
expecting the absolutely inevitable sprawl of comments about the title edit.
:P

So not strange, methinks. :P

------
nacho2sweet
LOL at everyone who works at places that fear that the company firewall saw
them read an article that had the word "Porn" in it.

Hopefully you get in trouble and fired, it will help your life in the long
run.

------
latchkey
As someone who built the infrastructure for serving porn for Kink.com, I'd say
that this was a total waste of time. Spend the money on a third party CDN and
serve from there.

~~~
kibibu
The article suggests that this is for live streaming shows. Would a CDN based
approach work in this use case?

~~~
latchkey
Take a look at KinkLive.com. We were the first porn company to do live
streaming in HD using a CDN (Bitgravity), all paid for, by the minute, with a
micro currency system (kinks) that we built.

~~~
kibibu
Very interesting! Thanks for your reply. Have you ever posted or discussed
your infrastructure before?

~~~
latchkey
I've posted about small parts of it in various places, but not the whole
experience. I no longer work there (since April 2010), so while I know they
still use quite a bit of the serving infrastructure that I built, my knowledge
is now quite a bit out of date.

One fun bit that I built is called the cockblocker (as you can imagine, we
used all sorts of fun names for internal projects). People who repeatedly
attempt to hack the system (usually through various forms of abuse like failed
login attempts) would automatically get their IP address routed to /dev/null.
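
A minimal sketch of that kind of blocker, assuming a sliding window over failed logins per IP. The class name, the threshold of 5 failures and the 10-minute window are made up for illustration, and the actual null-routing step (firewall or routing table change) is not shown.

```python
import time
from collections import defaultdict, deque


class FailedLoginBlocker:
    """Hypothetical tracker that flags IPs with too many recent failed logins."""

    def __init__(self, max_failures=5, window_seconds=600.0):
        self.max_failures = max_failures      # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self.failures = defaultdict(deque)    # ip -> timestamps of failures
        self.blocked = set()                  # ips that should be dropped

    def record_failure(self, ip, now=None):
        """Record one failed login; return True if the IP is now blocked."""
        now = time.time() if now is None else now
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that fell out of the sliding window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.blocked.add(ip)
        return ip in self.blocked

    def is_blocked(self, ip):
        return ip in self.blocked
```

A front end would call `is_blocked` before handling a request and hand blocked addresses to whatever actually drops the traffic.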

------
mezeek
This title and the article's title don't quite match...

~~~
crm416
Submitted title matched the page, but it's been changed by the mods

------
tarikozket
What service do you use for flowcharts?

------
ConceitedCode
Is there an open source project that accomplishes something similar?

~~~
bochi
There is an open source Nginx module that streams RTMP and creates HLS
segments that looks promising: <https://github.com/arut/nginx-rtmp-module>

There are also red5 (<http://www.red5.org/>) and rtmpd
(<http://www.rtmpd.com/>)

------
jneal
I know others already said this, but I clicked on this link from work and
immediately became appalled when realizing it was about porn and quickly
backed out. I'd gladly read this from the comfort of my own home and I'm sure
the content itself is SFW but still the point is I was misled. Can we please
[mods] not change titles in cases like this?

~~~
Myrmornis
Don't understand. You can browse articles for leisure at work. So what was the
problem with this article?

------
jeremyx
What is the association with this site toptal.com? Has anyone here ever worked
with them before?

~~~
toptal
Hey Jeremy, this is our engineering blog.

~~~
jeremyx
Thanks. This is the only article on the blog? I was actually intrigued by the
site and was wondering if anyone on HN has used this service as a dev or a
client....

~~~
elbear
I was in talks with them as a dev. Their requirements were a bit weird, so
that's why I didn't sign a contract in the end. I think there was one
requirement which asked that the developer answer the employer's message
within a pretty short time frame, no matter the timezone difference.

~~~
toptal
Hey Elbear, that's not in our contracts, at all. We do make sure our
developers communicate within 10 hours regardless of the timezone. We feel
that is more than reasonable and it's worked extremely well for Toptal.
However, that's not in our contract, it's actually something we simply stay on
top of as a company. Our requirements are not "weird" at all, they enforce
high integrity, and in most places in the world, the concept of high integrity
is "weird". The type of people who will not conform to such standards are
precisely the types of individuals who we would never want to work with in a
million years, and that is _precisely_ why freelancing platforms have such a
pervasively bad reputation.. because they're filled with low or even medium
integrity individuals. To us, that's unacceptable. As it is to any A-player
team. -Taso, CEO, Toptal

~~~
analog
Doesn't this mean that your employees are expected to be on call for 14 hours
per day though?

~~~
toptal
Your version of "on call" is not congruent with ours or most of the world's.
When a doctor or DBA is "on call", they have to get up, fix something, and
spend hours if not days doing it. We simply enforce communication. If you want
to call that "on call" then that's your decision, and we disagree with that
definition. To answer your question, yes, if you're responsible (in our
subjective definition of what constitutes responsibility), you will answer
within around 10 hours. In practically all communication you can reply "I got
this I'll answer you later." or something similar to show responsiveness. The
word responsibility stems from "response ability" and we believe that everyone
should be... responsible. That's in our DNA.

------
i5rider
Great work. I worked with RTMP for some time and I recall the pain of
reverse-engineering their protocol. One other comment: maybe the title should
be more like: "How I improved a slow/inefficient RTMP video streaming service
by 20x". Just a thought.

~~~
hackerboos
Would he get the same CTR if he changed the title?

------
ckdarby
Nothing in this article is open source, which was the biggest drawback for me
when reading it.

~~~
toptalgergely
Unfortunately that's the case. I heard really good things about Wowza these
days. Not open source but a lot of things are open and flexible. Also back
then Red5 was also an OK alternative, not sure if that's still the case.

------
proland
If you're going to change the title, at least add NSFW to it...

~~~
specialp
How is it NSFW? It is an article that is about video streaming at scale. Yes
it does mention it was for a porn site, but it really does not have much to do
with the article. If you work somewhere where this would be viewed as
inappropriate I am not sure what can be viewed at that employer.

------
ericcholis
The network graphics are pretty sharp, anybody familiar with where they came
from? Or, are they custom?

------
mariuolo
Why is the type of content relevant? Perhaps there's something peculiar about
porn users?

~~~
protomyth
Because, like it or not, the porn industry dealt with a lot of the technical
issues of selling video, pictures, and video streams on the internet before
anyone else. They needed it to be profitable / have actual revenue.

------
coherentpony
Congratulations to whoever messed with the title; I clicked this at work.

------
serginho
I never expected that the story would be about porn.

------
atechnerd
More efficiently? For two minutes of video? I could see if you were trying to
stream Zombieland or Spaceballs... but porn?!

I guess whatever inspires people to innovate is okay with me.

~~~
toptalgergely
Original writer here.

Actually this was a website which streamed live broadcasts from people, like
Ustream does but with adult content. Some broadcasts could be hours long.

