
Show HN: WebRTC Insertable Streams and server processing - Sean-Der
https://github.com/pion/webrtc/tree/master/examples/insertable-streams
======
Sean-Der
Hey HN!

This is a little demo of WebRTC insertable streams. This API has the
potential to open up some exciting possibilities with WebRTC.

The gist of it is that you have access to the video frames in the browser now!
You can see the browser here [0] flipping bits back (a simple XOR cipher) to
undo the cipher that was applied here [1] in the Go code.
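
For a concrete picture, here is a minimal sketch of the browser-side half:
XOR every byte of the encoded frame payload with a key byte, which undoes
what the server applied. The key value and the wiring shown in comments are
illustrative assumptions, not the demo's exact code.

```javascript
// Hypothetical single-byte key; the real demo's key lives in main.go/demo.js.
const xorKey = 0x55;

// XOR every byte of an encoded frame's payload in place.
// Applying it twice with the same key restores the original bytes.
function xorFrameData(buffer) {
  const bytes = new Uint8Array(buffer);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] ^= xorKey;
  }
  return buffer;
}

// In a browser with insertable streams, the transform is attached roughly like:
//   const { readable, writable } = receiver.createEncodedStreams();
//   readable
//     .pipeThrough(new TransformStream({
//       transform(frame, controller) {
//         frame.data = xorFrameData(frame.data);
//         controller.enqueue(frame);
//       },
//     }))
//     .pipeTo(writable);
```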

The use case I am most excited about is attaching metadata to frames. For
teleoperation especially, a big ask has been attaching metadata to a specific
frame, and this makes it possible!
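
As an illustration of what per-frame metadata could look like (my own sketch,
not part of the demo): append a fixed-size trailer to each encoded frame's
payload and strip it back off on the other side. The trailer size and the
sequence-number field are assumptions for the example.

```javascript
// Size of the hypothetical metadata trailer: a 4-byte sequence number.
const TRAILER_BYTES = 4;

// Append a big-endian uint32 sequence number after the encoded payload.
function attachMetadata(frameData, seq) {
  const out = new Uint8Array(frameData.byteLength + TRAILER_BYTES);
  out.set(new Uint8Array(frameData), 0);
  new DataView(out.buffer).setUint32(frameData.byteLength, seq);
  return out.buffer;
}

// Split the trailer back off, returning the original payload and the metadata.
function detachMetadata(frameData) {
  const payloadLen = frameData.byteLength - TRAILER_BYTES;
  const seq = new DataView(frameData).getUint32(payloadLen);
  return { payload: frameData.slice(0, payloadLen), seq };
}
```

On the receiving side you would run `detachMetadata` inside the same kind of
transform before handing the frame to the decoder, so the decoder only ever
sees the original payload.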

I also have a video demo on Twitter here [2], but it's not that exciting.

[0]
[https://github.com/pion/webrtc/blob/master/examples/insertab...](https://github.com/pion/webrtc/blob/master/examples/insertable-streams/jsfiddle/demo.js#L28-L38)

[1]
[https://github.com/pion/webrtc/blob/master/examples/insertab...](https://github.com/pion/webrtc/blob/master/examples/insertable-streams/main.go#L97-L100)

[2]
[https://twitter.com/_pion/status/1271956810015010816](https://twitter.com/_pion/status/1271956810015010816)

~~~
amelius
Hi, perhaps you could be a little more explicit in the README about what you
mean by "insertable stream" for people without the background. Is this like a
dataflow-graph where you can combine streams using generic "operation"
nodes/functions?

> The gist of it is that you have access to the video frames in the browser
> now!

As a total newbie in this area, it sounds surprising that this wasn't already
the case. Would you access these frames from Javascript or WebAssembly? Would
this be efficient enough (in terms of speed and power use)?

------
xorcist
Pion seems promising!

Is there anything like Jitsi built on Pion?

(I have been using Jitsi a lot lately. It is great software and a good example
that free software can be just as featureful as the proprietary alternatives,
but it is also a pretty complex software stack on its own; there's a lot to
learn just to extend it.)

~~~
Sean-Der
Thank you :) Yes! There are three 'media servers' that I know of. You can also
find more stuff in awesome-pion [2]

* [https://github.com/pion/ion](https://github.com/pion/ion)

* [https://github.com/peer-calls/peer-calls](https://github.com/peer-calls/peer-calls)

* [https://www.irif.fr/~jch/software/sfu/](https://www.irif.fr/~jch/software/sfu/)

Ion (the first one) was built with extensibility in mind. Each service runs in
its own container, so you can just fork ion-app-web [0] and still get the
backend improvements.

We are working on adding RTMP/SIP bridges as their own nodes. There is also a
generic 'AVP' node that allows media processing: you can save to disk, or do
other custom stuff you define.

If you are interested, I would love to have you involved. Hop on Slack [1] or
feel free to just leave ideas/feedback on GitHub!

[0] [https://github.com/pion/ion-app-web](https://github.com/pion/ion-app-web)

[1] [https://pion.ly/slack](https://pion.ly/slack)

[2] [https://github.com/pion/awesome-pion](https://github.com/pion/awesome-pion)

------
ex3ndr
I am kinda annoyed by VP8. It is so widespread and yet so awful due to the
lack of hardware support. I think even the latest macOS doesn't support
hardware encoding. Once video is on VP8, an iPhone becomes hot and a
5-year-old MBP turns on all its fans.

Just switch to H264 and everything works just fine.

Quality is not that different.

Why is everyone so obsessed with it?

~~~
j1elo
> _the latest macOS doesn't support hardware encoding. Once video is on VP8,
> an iPhone becomes hot and a 5-year-old MBP turns on all its fans_

I don't understand. Why isn't your question "_Why is Apple so extremely late
to the party?_"?

An iPhone becoming hot with VP8 is caused by Apple not improving VP8 support,
not by any inherent quality of H.264. For quite a long time they denied that
the industry had standardized _mandatory_ support for _both_ H.264 and VP8 in
any WebRTC implementation; Apple ignored that, and until very recently they
chose to ignore half of those codecs.

If I had to guess, maybe "everyone" is "obsessed" with VP8 because H.264 is
encumbered by royalties and stuff. VP8 is royalty-free, and its successor VP9
is too.

~~~
ex3ndr
What? If your CPU has H264 then you shouldn't pay any licenses. You will have
it in any CPU everywhere.

It is not that Apple is late to the party; it just doesn't make sense to
implement VP8 in silicon. Also, burning MacBooks are an Intel issue, not a Mac
one. Or maybe the Chrome team is not using hardware acceleration.

VP8 is supported for sure, but the codec is subpar compared to H264. I can
easily encode H264 even on a Raspberry Pi Zero, and I can't do the same for
VP8 at all.

Also, the average engineer doesn't care about royalties, but still everyone is
using VP8 as the default.

~~~
Orphis
Disclaimer: I work on WebRTC, somewhere in the codec stack. I'm certainly not
the most knowledgeable in that field, but here are a few reasons.

Software encoders will always have more features than hardware encoders, and
hardware encoders are prone to bugs.

A bug in a HW encoder needs either a new driver or new silicon to fix. You
can't work around those, and sometimes you can't even detect which version
you're dealing with.

So if you want to ship a reliable application, you can either use a software
codec (which you control) or hope the HW encoder which is totally random works
as intended.

HW encoder bugs show up in many forms. Sometimes the stream is just broken and
can't be decoded; sometimes the encoder won't listen to the encoding parameter
updates you send it (e.g. max bitrate or quality settings); sometimes it
generates a valid but bad stream (too many I-frames, breaking the max bitrate
setting); sometimes it has a TERRIBLE latency.

Sometimes they just plainly lack features: they can't encode temporal layers,
don't have any denoising (useful for VP8), have a limited number of instances
(limiting simulcast opportunities), or have extreme frame size limitations (so
you can't do screensharing with them)...

And that's just what I could come up with in a few minutes of thinking; I'm
sure people who have worked on this for much longer would be able to say much
more.

~~~
j1elo
We could argue that those issues apply exactly the same to both H.264 and VP8
hardware encoders, so they aren't reasons for the lack of VP8 encoders in
phones and other devices... otherwise we wouldn't have H.264 hardware either.

I agree in spirit with the parent comment (although judging from my post it
would seem the contrary). Ideally, both codecs would have hardware encoders;
the difference in performance and battery savings is huge. I just understand
that the industry players felt it was necessary to provide a royalty-free
option among the proposed codecs... either that, or it was just Google pushing
their own thing, which also sounds very likely.

~~~
Orphis
They do apply to all codecs, absolutely. Some codecs have historically had
more usage, so their encoders might behave better. There are still bugs
though; I remember hearing about a device producing an H264 stream that
another device couldn't decode in HW. It's still not perfect.

Also, as many people have pointed out, VP8 is a more complex codec than H264,
so it's harder to test coverage of all its features, especially as most
applications (especially on desktop) are not really doing RTC, which is quite
a specific use case.

And phones DO have VP8 encoders. It's just that iPhones, for historical
reasons, don't have one. Pure speculation on my part, but I'm guessing they'll
do AV1 someday, as Apple is working on that spec.

But in general, even though HW encoders would be awesome for all codecs if
they worked well in all situations, the reality is that they don't always
work; you usually need to build lists of HW that supposedly behaves well with
your use case. All advanced features (e.g. spatial layering) are tricky to
implement in HW and are usually not configurable, so SW codecs will still see
quite a lot of usage for the sake of quality.

------
TomMarius
How does WebRTC go together with HTTP/3 and QUIC, or is it unrelated at all?

~~~
pthatcherg
Hey, I used to work on this :).

It's complicated :).

Basically, WebRTC is a combination of a bunch of protocols: ICE, DTLS, SCTP,
and RTP. You could theoretically reduce that to ICE + QUIC for p2p use cases
and just QUIC for client/server use cases.

For p2p, there is RTCQuicTransport (see
[https://developers.google.com/web/updates/2019/01/rtcquictra...](https://developers.google.com/web/updates/2019/01/rtcquictransport-api))

For client/server, there is QuicTransport (see
[https://web.dev/quictransport/](https://web.dev/quictransport/))

Of course, you'll probably want to be able to encode and decode some audio and
video as well to make that useful. For that, there is WebCodecs (see
[https://github.com/WICG/web-codecs](https://github.com/WICG/web-codecs) or
[https://www.chromestatus.com/feature/5669293909868544](https://www.chromestatus.com/feature/5669293909868544))

For use cases like live streaming and cloud gaming, I did a presentation about
the combination of WebTransport + WebCodecs:
[https://www.w3.org/2018/12/games-workshop/slides/21-webtrans...](https://www.w3.org/2018/12/games-workshop/slides/21-webtransport-webcodecs.pdf)

And then there is the work happening in the IETF along these same lines:
[https://datatracker.ietf.org/wg/ript/documents/](https://datatracker.ietf.org/wg/ript/documents/)

~~~
TomMarius
Thank you and others for answering me! :-)

