Show HN: 30ms latency screen sharing in Rust (github.com/bitwhip)
353 points by Sean-Der 9 months ago | 69 comments



I wrote this to solve a few things I cared about.

* I want to show people that native WebRTC players can be a thing. I hope this encourages hangouts/discord/$x to implement WHIP and WHEP; it would let people do so much more

* I wanted to make low latency sharing easier. I saw the need for this working on adding WebRTC to OBS and Broadcast Box[0]

* I wanted to show devs what a great ecosystem exists for WebRTC. Lots of great implementations in different languages.

* Was a bit of a ‘frustration project’. I saw a company claiming only their proprietary protocol can do latency this low. So I thought ‘screw you I will make an open source version!’

[0] https://github.com/glimesh/broadcast-box


Hey Sean, we both worked at Twitch Video but I left just as you were joining. I currently work on the Discord video stack and am somewhat curious about how you imagine Discord leveraging WHIP/WHEP. Do you see it as a way for these clients to broadcast outwards to services like Twitch or more as an interoperability tool?


Users want to send WHIP into Discord. The lack of control over screen sharing today is frustrating. Users want to capture via another tool and control bitrate/resolution.

Most Broadcast Box users tell me that’s their reason for switching off discord.

———

With WHEP I want to see easier co-streaming. I should be able to connect a room to my OBS instance and have everyone’s video show up automatically.

I don’t have this figured out yet. Would love your opinion and feedback. Want to comment on the doc, or would you like to talk 1:1? siobud.com/meeting


What's the plan for Wayland compatibility? For a little while I was able to share a single app - but not the full desktop. Now I can't share anything from Ubuntu 24.04 when using Wayland :(


I had to tweak a few things to make it work on my Alpine Linux. This tutorial helped a lot: https://github.com/emersion/xdg-desktop-portal-wlr/wiki/%22I...


Another thing in this realm: I am adding native co-streaming/conferencing to OBS [0]. OBS can send WebRTC; next I want to make receiving work well.

Between that and Simulcast, I hope to make real-time video dramatically cheaper and easier.

[0] https://docs.google.com/document/d/1Ed2Evze1ZJHY-1f4tYzqNZfx...


This would be fabulous, thank you so much for working on that. What kind of latency does dual encoding (on the client, then on the receiver again) add? Are there codecs that can have multiple streams on the same image (as in zones of independent streams on the video surface)?


It definitely adds latency, but not enough to be a bad experience.

We have vdo.ninja today and Twitch's Stream Together. Those both do the 'dual encoding' and it is a good enough experience that users are doing it!


Is it possible to extend OBS into a p2p video chat client?


I don't believe it is possible today, but I am working on it! It would be great for users trying to co-stream.

I created OBS2Browser[0], which is a good first step in that direction.

[0] https://github.com/Sean-Der/OBS2Browser


Great things are accomplished by spite programming! https://hackaday.com/2018/01/03/spite-thrift-and-the-virtues...


Question: 30ms latency sounds amazing, but how does it actually compare to "the standard" desktop sharing tools? Do you know the latency of, say, MS RDP or VNC for comparison?


I doubt the protocol itself makes a big difference. I bet you can get 30ms with VNC. The differences with BitWHIP:

* Can play WebRTC in the browser. That makes things easier to use.

* Simpler/hackable software. BitWHIP is simple and uses NVENC etc… if you use NVENC with VNC I bet you can get the same experience


Thanks, the code is really useful to read through.


Any plans on integrating L4S with e.g. Tetrys-based FEC and using a way where the congestion feedback from L4S acts on the quantizer/rate-factor instead of directly on bitrate?

It's much more appropriate to do perceptual fairness than strict bitrate fairness.

Happy to have a chat on this btw; you can best catch me on discord.


Depends on the network, surely? Lots of applications for low latency video where you are not sharing the channel, but it has a fixed bandwidth.


E.g. "Low Latency DOCSIS"[0] and related, WiFi[1], support it and with the former it's about non-exclusive scarce uplink capacity where cross-customer capacity sharing may rely on post-hoc analysis of flow behavior to check for abuse, switching to forced fairness if caught by such heuristics. For downstream it's even more natural to have shared capacity with enough congestion to matter, but often only the WiFi side would have a large discretionary range for bandwidth scheduling/allocation to matter much.

Apple already opportunistically uses L4S with TCP-Prague and there are real-world deployments/experiments [2] with end-to-end L4S.


[0]: https://github.com/cablelabs/lld

[1] Relevant excerpt from [0]: Applications that send large volumes of traffic that need low latency, but that are responsive to congestion in the network. These applications can benefit from using a technology known as "Low Latency, Low Loss, Scalable Throughput (L4S)". Support for this technology is included in the LLD feature set, but is beyond the scope of what we have in this repository. Information on L4S can be found in this IETF draft architecture.

[2]: https://www.vodafone.com/news/technology/no-lag-gaming-vodaf...


This is awesome. I would love it if you had some examples of how to use AntMedia as a source. I am mostly in video engineering, so reading the source comes slower to me. This would be really handy in many cases.


Is the restriction to NVIDIA necessary for the low latency?


Nope! I want to add all the other flows.

NVIDIA is the absolute lowest, I believe. I wanted to do it first to know if it was worth building.


> I saw a company claiming only their proprietary protocol

Did the company have a “ripple” in its name? Curious


Let me find it again! I saw it on LinkedIn and it was such a bullshit promo thing


I remember around two years ago, we got in touch with a company—without mentioning the name, but it has "ripple" in it—and after an hour-long seminar, an NDA, password-protected binaries, and other BS, they barely delivered ~150ms latency.


As someone who set up a Discord-streaming-like service to use alongside Mumble, this is very exciting. I couldn’t get anything involving WebRTC working reliably, but the only broadcasting clients I found were web browsers and OBS, so I am interested to see how this compares!

What I eventually settled on was https://github.com/Edward-Wu/srt-live-server with OBS and VLC player, which gives robust streaming at high bitrate 4k60, but latency is only 1-2 seconds


Excited to hear what you think! If there is anything I can change/improve tell me and will make it better :)


Couldn't get it to work on Windows 11. I was able to run the just install script only after editing it to use the full path to the 7zip binary. It said it installed correctly, but then when I tried `just run play whip` I got this:

  cargo:rustc-cfg=feature="ffmpeg_7_0"
  cargo:ffmpeg_7_0=true

  --- stderr
  cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
  thread 'main' panicked at C:\Users\jeffr\.cargo\registry\src\index.crates.io-6f17d22bba15001f\bindgen-0.69.4\lib.rs:622:31:
  Unable to find libclang: "couldn't find any valid shared libraries matching: ['clang.dll', 'libclang.dll'], set the `LIBCLANG_PATH` environment variable to a path where one of these files can be found (invalid: [])"
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


Looks like you need libclang for the ffmpeg bindings.
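The panic message itself points at the fix: bindgen needs to locate libclang. A sketch of the usual workaround, assuming LLVM/Clang is already installed (the paths below are illustrative and vary by install):

```shell
# bindgen loads libclang at build time; point LIBCLANG_PATH at the
# directory containing the shared library before building.
# On Windows (cmd), the equivalent would be something like:
#   set LIBCLANG_PATH=C:\Program Files\LLVM\bin
export LIBCLANG_PATH=/usr/lib/llvm-14/lib
```

After that, re-running the build should let bindgen generate the ffmpeg bindings.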


Looks like the install script is incomplete and fails to check for and install all prerequisites.


Sorry about that! I will continue working on improving it.

I am also going to drop the 7z usage. Powershell has unzip built in


What is the reason for using "just" here?

I understand people have their tooling preferences, but this looks like something that build.rs or a plain makefile could have handled?


I was also wondering if anyone could chime in on advantages of using just.

I'm familiar with makefiles, is there a particular advantage to using just over makefiles or is it personal preference? (which is a totally valid answer! I'm just wondering if I'm missing something)


I think the appeal of just is that it is simpler than make. It does not check file timestamps; it executes a DAG of tasks unconditionally.
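A minimal justfile sketch of that model (recipe names here are illustrative, not necessarily the project's actual ones): dependencies form the DAG, and every recipe runs each time it is invoked, with no timestamp checks:

```just
# `just play` always runs `build` first, then its own body;
# nothing is skipped based on file modification times.
build:
    cargo build --release

play: build
    cargo run --release -- play whip
```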


My first thought was that that was dropping one of the main features of make.

On reflection though, the timestamp-dependent part isn't really used much nowadays apart from compiling C.

It'd be cool if it were an opt-in feature for justfiles so that it could actually function as a replacement for make in all cases.

I went looking in the docs and found this[0] which I'd missed last time I looked into justfiles.

[0] https://github.com/casey/just?tab=readme-ov-file#what-are-th...


I don't really buy his justification that ".PHONY: xxx" is hard to remember, so we should have a completely new tool instead.

Make has its issues, but it also has two big advantages: it's simple and everyone already has it.


Everyone already has it... on Linux and Mac. It's pretty rare for it to be available on Windows.

That said I kind of agree. I like the idea of `just` but it does seem like they have just created a complicated DSL.

I think it is better to just write your infra scripting in a real language. I generally use Deno or Rust itself and a thin wrapper that `cargo run`'s it. Using Rust eliminates a dependency.


Anyone who's halfway serious about software development on Windows surely has make there too, and it's not like non-developers are the target audience for 'just' scripts


> Anyone who's halfway serious about software development on Windows surely has make there too

Not even remotely. I know it might be hard to imagine if you only program on Linux/Mac but there's a whole world out there that isn't built on janky shell scripts and Makefiles. If you use C# or Java or Visual C++ or Qt on Windows it's pretty unlikely that you'd have Make. It's kind of a pain to install and you don't need it.


Literally zero of the hundreds of devs I know who do software development on Windows have make installed. Why would they? It's not usual in that space at all; that's msbuild.


I agree, and even more strongly: you don't even need to remember .PHONY as long as your target names don't overlap with actual filenames, which is usually easy.

In fact, I didn't even know about .PHONY and have used make for a long time. That's what's great about it, even if you stick to the most basic features make is incredibly easy and straightforward. Dare I say, it "just" works lol.

I hate the proliferation of new tools that are the same as a tool that's been around for 20 years and is no different in any significant way except being trendy. Just unnecessary entropy. Our job is to manage and reduce, not maximize entropy.


> it's simple and everyone already have it.

Not always; Go programmers, for example, often forget that they need the C build tools for their platform to get Make.

It's also just about the furthest thing from simple; the language is nasty, so people just use it as an executor, which is a lot of tooling for such a simple use case.


Also this:

>The explicit list of phony targets, written separately from the recipe definitions, also introduces the risk of accidentally defining a new non-phony target.

... seems to think the only way to define phony targets is:

    .PHONY: foo bar
    foo:
       ...
    bar:
       ...
... which has the problem that bar's definition is distant from its declaration as a phony target. But this form is equivalent and doesn't have that problem:

    .PHONY: foo
    foo:
       ...
    .PHONY: bar
    bar:
       ...
This ability to declare dependencies of a target over multiple definitions isn't even unique to `.PHONY`.


Wouldn't a shell script work just as well, then?

I'm not against new better tooling, but I also want to keep my dev machine reasonably clean.


Shell scripts don't work well on Windows.


Even powershell sometimes with execution policies


I would just use WSL then, if native windows dev tooling is such a shit show


I recently switched my (small) company over to using just files within our codebases and it's been going over very well thus far.

We're building a set of apps that need to run on Linux, macOS, and Windows, so having a consistent solution for each is better than shell scripting, and I personally have never felt great about make and its weirdness.

It also helps that we have a pretty big monorepo so that anyone can bounce from one app to another and `just run` to use any of them, no matter the platform.

Either way the justification for me came from COSMIC[0].

[0] https://github.com/pop-os/cosmic-epoch/blob/master/justfile


John did all the work on this.

Just is nice as a Windows user. When I started committing, everything already worked really well. Editing the just stuff is also really easy. Much nicer to read than scripts, I think.


Ooh, I’ve been looking for a good solution for this for years. Currently I use Parsec, but it’s closed source and not compatible with direct streaming from OBS etc. I’ll definitely check this out.


Always a bit skeptical when it comes to latency claims, especially in the sub-100ms space, but screen sharing 1:1 or video ingest should be a great use case for WebRTC.

WebRTC is a great technology, but it still suffers from a scaling problem that is harder to resolve. On top of that, the protocol itself does not define things like adaptive bitrate switching or stalling recovery

Curious to hear what you think of some (proprietary) options for low latency playback like LL-HLS, LL-DASH, WebRTC, or HESP.


WebRTC has congestion control and Simulcast/SVC; what is missing for adaptive bitrate switching? What is stalling recovery? I believe NACK/PLI handles this?

WebRTC doesn’t have a scaling problem. I think it was a software problem! Twitch, Tencent, Agora, Phenix all do 100k+ these days

I like WebRTC because of the openness of it. I also like that I only need one system for ingest and playback. I am HEAVILY biased though, way over-invested in WebRTC :) I tend to care about greenfield/unique problems and not enough about scaling and making money


I wrote a blog post about how numbers like "30ms latency" are thrown around called "How to lie about latency": https://www.obe.tv/how-to-lie-about-latency/

It's left as an exercise to the reader which methods of lying are being used in this case.


Amazing work! The best I could achieve was ~40ms for video streams, although that was over a cellular network from a drone. But 30ms is a new milestone! I will see if I can repurpose this and test out a real-time video stream from a robot if I get some spare time.


I don't get what it does, exactly? This doesn't seem to be an OBS alternative (judging by the description), but… I mean, isn't it exactly the same as just running OBS directly?


Looks like a LAN tele…er, screen sharing server/client. Presumably you could serve over the internet, but it will not get the 30ms latency. Aside from the streaming (I only spent a few minutes reviewing the source) it’s a live JPEG kind of thing. I built something similar to screen share with my kids when we played Minecraft together. It was really for me, because once we got in game they would take off and in 5 minutes be screaming for help 10 chunks away in some zombie- and skeleton-infested cave at or near bedrock. Being kids, I never got good enough directions to help them in time. Anyway, it was a fun project. I used CUDA and could get 60fps per client on CAT5 and 45-ish over WiFi, dropping to 10-15fps when I walked in and out of rooms with the laptop. 60fps is ~16.7ms per frame, and 20ms per frame is 50fps.
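The frame-rate arithmetic in that last sentence can be sketched as a trivial helper (not from the project, just the math):

```rust
// Per-frame time budget in milliseconds for a given frame rate.
fn frame_ms(fps: f64) -> f64 {
    1000.0 / fps
}

fn main() {
    println!("{:.1} ms per frame at 60fps", frame_ms(60.0)); // ~16.7 ms
    println!("{:.1} ms per frame at 50fps", frame_ms(50.0)); // 20.0 ms
}
```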


>Presumably you could serve over the internet but it will not get the 30ms latency.

Indeed, you'll have to live with something like 80ms to 100ms latency over the internet and a horrifying 160 ms if you want to have things respond to keyboard and mouse inputs.


Then how do Moonlight, Parsec, or GeForce Now work? Sub-10ms latency, sometimes even sub-5 depending on time of day and network congestion.


Ever heard of the Akamai network? Netflix might be a good example. Traceroutes show latency between network hops. To reduce latency you either buy better network hardware, buy better cabling, or reduce hops in the network. Since the first two are more expensive than the third, if your service must have very fast response between server and client, move the server closer to the client. Large corporations run cache servers in multiple data centers everywhere geographically so the response time for clients is better than their competition's. Part of why new video services struggle to compete with YouTube is that YouTube can afford this kind of architecture where a startup cannot. Even if it's the best code money can buy, it will never provide the same level of experience to users as local cache servers. Kind of sucks nobody can compete.


GP is talking about playing remotely rendered video games, so no caching here but yes distance to server indeed.


That is only ever theoretically possible with a direct fiber connection and <= 200 miles between the two nodes.

So the answer is “there’s a data center in your city.”
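The physics limit this sub-thread is circling can be sketched as follows, assuming signal propagation in fiber of roughly 200,000 km/s (the speed of light divided by the glass's refractive index of ~1.5):

```rust
// Minimum round-trip time imposed by propagation delay alone,
// assuming ~200 km per millisecond in optical fiber.
fn min_rtt_ms(distance_km: f64) -> f64 {
    const FIBER_KM_PER_MS: f64 = 200.0;
    2.0 * distance_km / FIBER_KM_PER_MS
}

fn main() {
    // ~322 km (200 miles) costs about 3.2 ms of RTT before any
    // encode, decode, or queuing delay is added on top.
    println!("{:.1} ms", min_rtt_ms(322.0));
}
```

So sub-5ms figures really do imply a server within a few hundred kilometers, i.e. "a data center in your city."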


It is also a player!

You can either pull the video from a WHEP source or run in a P2P mode. I wanted to demonstrate the flexibility and hackability of it all :)


Mostly glues two libraries? ffmpeg for capture and play, plus WHIP?


Yep! It glues ffmpeg, str0m[0] and SDL together. I hope BitWHIP doesn’t need to exist someday; when WHIP/WHEP has enough traction it will be easier to land in FFmpeg.

[0] https://github.com/algesten/str0m


How does this compare with Moonlight?


Can this be used as remote desktop?


Yes! I want to add remote control features to it. Lots of things left to do

Any interest in getting involved? Would love your help making it happen


Why is the 30ms significant?


vdo.ninja is another excellent alternative but I'll definitely check this out!


[flagged]


It's not about proving anything, I still find it informative knowing which platform a specific implementation or solution is based on. Seen any screen sharing apps written in Erlang/Elixir lately? Such a contraption would be highly curious and interesting to me.

Otherwise in this instance it'd have been reduced to merely "Here's a thing". Pretty dull and boring.


I like it as a signal for “probably cares about some things that most devs don’t bother to care about”. Speed/responsiveness, for example, in this case.



