In 2015 I wrote a whole contact management app for a business that let their entire sales team of about 50 reps pick up phone calls from the browser or from their desk phone using Twilio's SIP trunk service. It routed the calls to our internal Asterisk server and worked flawlessly with Twilio's WebRTC libraries in the web app. It was really cool and surprisingly easy. All of the call recordings instantly became available in the web app, and reps could initiate calls from their call calendar in the app.
I really like Twilio's APIs and how simple they make stuff like this. You can clone a repo of theirs and get a video call in your browser / on your phone in about 5 minutes. So smart.
> Lastly, we’ve decided to end-of-life (EOL) Twilio Programmable Video as a standalone product. Given it’s such a niche area and a relatively small part of our portfolio, we believe partnering with video industry leaders is the best way to ensure long-term product innovation for our customers. Removing Programmable Video from our portfolio will also allow Communications to more effectively focus on our pillar products - Messaging, Voice, and Email.
I want to embed phone calls (and store the conversation details) within my custom web app, which is a light CRM. Is there a simpler SaaS plugin for that, or is the Twilio API alone good enough?
Really exciting times for WebRTC in general right now!
If you are new to WebRTC and want to learn more about the protocol, check out [0] (I would love people's feedback). It is used in lots of unexpected places, like streaming (added to OBS) [1] and embedded [2].
I am especially excited about new implementations popping up, like [3] and [4].
Thanks for the reply, and apologies: I don't see notifications for replies.
That's really interesting, I must test it out sometime; 120ms is still pretty good. In a previous test I had GStreamer (on an embedded NVIDIA Jetson) sending to a laptop running OBS, with latency as low as 40ms. The problem is the need to use OBS (or GStreamer directly) on the viewer side, and I would love to integrate it into the web UI I made.
It would definitely be a useful feature. The way I plan to use it (hopefully it will work with Broadcast Box) is to install it on the NVIDIA SBC and use it to send the low-latency stream to the p2p web UI client, instead of how I did it last time, where GStreamer on the SBC sent the stream to OBS (note that OBS had the GStreamer plugin). That test gave me the best latency results. Any other protocol resulted in ~2 sec, which was bad for the application I needed. I did try WebRTC back then, but it was not as efficient as using GStreamer directly with sink/source pipelines. So if I can get Broadcast Box (on SBC) -> web UI working in my tests, that would be a big step forward, as I need to get rid of OBS. If I try that soon I will definitely update you!
This is using sipgo as a back-to-back user agent (B2BUA). There are really two sessions going on: one between the SIP caller and sipgo, and a second between sipgo and the browser. These sessions each have their own signaling and media.
The interesting part is how the signalling is happening between the sipgo server and the browser. You copy your session description from the browser and paste it to the server! In a "real" application, you could send it over a custom API call or something, or you could use a standard protocol for initiating sessions (SIP!).
This is all to say, "bridging" between SIP and WebRTC is not nearly as magical as it is often made out to be. In some cases, the media can even be passed through without modification. (Although many SIP endpoints do not support e2ee and so a B2BUA is still needed to wrap/unwrap the encryption.) We were doing this a decade ago with SIP.js and OverSIP.
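For anyone curious what the copy-and-paste signalling step amounts to, here is a minimal sketch. It assumes the session description is wrapped as JSON and base64-encoded so it survives being pasted as one line; the function names and the placeholder SDP are mine for illustration, not taken from the example being discussed.

```python
import base64
import json


def encode_description(sdp_type: str, sdp: str) -> str:
    """Pack a session description into a single copy-pasteable line."""
    payload = json.dumps({"type": sdp_type, "sdp": sdp})
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def decode_description(blob: str) -> dict:
    """Reverse encode_description on the receiving side."""
    payload = base64.b64decode(blob.strip())
    return json.loads(payload.decode("utf-8"))


# Placeholder SDP for illustration only; a real offer carries media
# sections, ICE credentials, and a DTLS fingerprint.
offer_sdp = "v=0\r\no=- 0 0 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\n"
blob = encode_description("offer", offer_sdp)
restored = decode_description(blob)
```

Swapping the paste step for an HTTP POST or a SIP message body changes only how `blob` travels, not what is in it.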
> This is all to say, "bridging" between SIP and WebRTC is not nearly as magical as it is often made out to be
100% agree. That is what I have been trying to solve with Pion/WebRTC for the Curious. It is portrayed as being harder than it is. Why people do that I don't know. It causes lots of problems though.
* Developers are hesitant to get into WebRTC/VoIP because they think they couldn't figure it out
* Companies are pushing proprietary alternatives and saying 'existing protocols are impossible to use'
I am sure that some wonderful ideas never got built because the education/community just wasn't welcoming enough.
As others have pointed out, this has been a very mainstream, and largely turn-key use of Kamailio, FreeSWITCH, etc. for over a decade.
I am closely involved with Kamailio, and I believe its websocket module, meant to effectuate this functionality, was introduced somewhere in the 2011-13 time frame.
While of course the ecosystem can always stand to be richer, there's nothing new here.
Side note: "WebRTC to SIP" isn't an especially intelligible formulation, as WebRTC does not prescribe a signaling protocol. SIP can be used on the browser side, and often is (for interoperability benefits both real and imagined), but needn't be necessarily.
Pion's approach to the problem is different to the software you have described.
Instead of configuring a server you are given primitives (SIP and WebRTC in this case). I find this powerful when trying to implement use cases that go off the beaten path. I think a lot of the RTC software of this generation is a reaction to that.
----
> "WebRTC to SIP" isn't an especially intelligible formulation
This is the language I see most developers and customers use. What do you think a < 10 char name would be?
----
I don't think this attitude is healthy for the space. If I was a new developer looking at different spaces this would make me not want to be in VoIP/RTC. The goal of my example was to get people excited and try to make WebRTC and SIP more accessible. What did you hope to accomplish with saying "others having been doing this for a decade" and "isn't an especially intelligible formulation"?
> What did you hope to accomplish with saying "others having been doing this for a decade"
The charitable interpretation is that the point is to let people know that there are several ways to do something similar, and there are off-the-shelf modules that can be used to accomplish it, if you don't want to use a browser. (The less charitable, but understandable interpretation probably runs along the lines of, "we've been doing this for a decade and it's frustrating that we haven't gotten enough recognition for it, to the point that the subject of this HN post seems to be striking people as quite unlike anything others have done before".)
> and "isn't an especially intelligible formulation"?
I think that's just typical nerd pedantry. As someone who has worked in this space (I co-built the WebRTC bits of Twilio's browser voice product back in 2012, including building a server-side implementation of WebRTC when none existed [at least not in a form we could use], and bridged that to our internal SIP-based voice infrastructure), I'm vaguely sympathetic to the desire for precision. It is correct to say that you can't bridge "WebRTC" to "SIP"; as I'm sure you know after the work you've done, WebRTC is an RTP-based media stack mated with STUN/TURN that happens to use SDP to describe voice/video sessions, while SIP is a signaling protocol that doesn't really specify anything about media.
I'm not sure I'd agree that most developers and customers use the language you assert, but admittedly I've been out of that space for a good 7 years at this point. Regardless, I think it's valuable to understand -- for perhaps the hypothetical someone new coming into the space that you bring up -- what WebRTC and SIP actually are, and that you can even use SIP to set up WebRTC sessions without any sort of bridging to other infrastructure.
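To make the split concrete, here is a rough sketch of a SIP INVITE that simply carries an SDP blob as its body. SIP supplies the headers and routing; the SDP is opaque payload that the media stack interprets. The addresses, tags, and truncated SDP are placeholders I made up, and the helper is illustrative rather than any particular library's API.

```python
def build_invite(caller: str, callee: str, local_host: str, sdp: str,
                 call_id: str, branch: str, tag: str) -> bytes:
    """Assemble a bare-bones SIP INVITE whose body is an SDP description."""
    body = sdp.encode("utf-8")
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        # WSS is the Via transport token for SIP over secure WebSocket.
        f"Via: SIP/2.0/WSS {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={tag}",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller};transport=ws>",
        "Content-Type: application/sdp",
        # Content-Length counts the bytes of the body, not characters.
        f"Content-Length: {len(body)}",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body


# Placeholder SDP: a real WebRTC offer also has ICE and DTLS attributes.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=- 0 0 IN IP4 127.0.0.1\r\n"
    "s=-\r\n"
    "t=0 0\r\n"
    "m=audio 9 UDP/TLS/RTP/SAVPF 111\r\n"
    "a=rtpmap:111 opus/48000/2\r\n"
)

invite = build_invite(
    caller="alice@example.com",
    callee="bob@example.com",
    local_host="client.invalid",
    sdp=EXAMPLE_SDP,
    call_id="a84b4c76e66710",
    branch="z9hG4bK776asdhds",
    tag="1928301774",
)
```

Nothing in the header block says anything about codecs or encryption; all of that lives in the body.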
> The less charitable, but understandable interpretation probably runs along the lines of, "we've been doing this for a decade and it's frustrating that we haven't gotten enough recognition for it, to the point that the subject of this HN post seems to be striking people as quite unlike anything others have done before".
I think that's a fair digestion of the force behind the tenor of my comment, which I would otherwise agree is perhaps a bit harsh.
> I think that's just typical nerd pedantry [...] It is correct to say that you can't bridge "WebRTC" to "SIP"
It's a bit ironic to me because I come from a humanities background, leading to amusing - and sometimes infuriating - cultural clashes with nerds and their pedantry, since I'm very comfortable gesturing at commonsensical understandings of things and being what programmers would consider "vague". I often grouse about needless nerd pedantry myself.
Nevertheless, telecom is about rigorous and laboriously articulated standards, and always has been. I think you correctly discern a note of displeasure in the way that people with this kind of metaphysic see the socialisation of RTC technology into the web economy. It's a bit like how Node is seen by classical systems programmers to be a psy-op meant to convince JS UI developers that they can write robust backend code. It does the job well enough, but sometimes you have to actually know things.
Yes, in short, I think that conflating "WebRTC" and "SIP" in this manner, and popularising this crude understanding of the various parts and where they fit, does not so much further productive adoption and public understanding as hinder it. In the case of RTC in particular, where tolerances are lower and robustness is paramount, taking the shortest path to democratisation of the most general QuickStart Guide may not be the best thing to do. We see the consequences of this every day on the mailing lists of the various open-source building blocks previously mentioned, as well as others, like JsSIP and SIP.js.
For a moment I was thinking that perhaps this could be the foundation for something to replace Google Voice for some people. Unfortunately I am not sure that most SIP telephone numbers will be usable for things like account verification.
If I get a number via Twilio/$X, is the receiver of the call able to tell? I haven't spent a lot of time with SIP and POTS stuff. All my time has been in WebRTC, and I got into SIP for work.
Unfortunately, yes. After I ported my number from AT&T to Google Voice, a lot of services refused to accept it. Requiring SMS 2FA with a non-VoIP number seems to be a common anti-spam measure these days. It's often required for new accounts.
Yes, there are many APIs available to look up the carrier that services a phone number. You can sort these carriers into categories (landline, mobile, VOIP...) and many services won't accept the number for SMS OTP use if the carrier isn't a "real" mobile carrier, in a somewhat hamfisted effort to prevent fraud.
As a counterexample I long ago ported my landline phone number to a SIP provider that supports SMS and due to the phone number being baked into various accounts the family has, I know it works for verification at least for those services (one of which is a bank).
You figured out the right way to do it: porting an existing number. That is a good workaround for the easy way to do “service provider lookups” via an NPA NXX database like https://localcallingguide.com/lca_prefix.php. I suspect that if you go there and enter your area code and NXX you’ll see it listed as your original landline provider and not your SIP provider. When you go to provision a phone number from Twilio’s own pool, you’ll often see that all of the numbers come from a small number of NPA-NXX-xxxx blocks and those blocks are the ones that many 2FA and user auth services reject.
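A rough sketch of what such a prefix-based check might look like, assuming North American numbers and a lookup table keyed on (NPA, NXX). The table contents, carrier names, and function names are entirely hypothetical; a real service would query a numbering database.

```python
import re
from typing import NamedTuple, Optional, Tuple


class BlockOwner(NamedTuple):
    carrier: str
    line_type: str  # "mobile", "landline", or "voip"


# HYPOTHETICAL data for illustration; not real block assignments.
PREFIX_TABLE = {
    ("202", "555"): BlockOwner("Example Wireless", "mobile"),
    ("303", "555"): BlockOwner("Example SIP Trunks", "voip"),
}


def npa_nxx(number: str) -> Tuple[str, str]:
    """Return (area code, exchange) for a North American number."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) != 10:
        raise ValueError(f"not a 10-digit NANP number: {number!r}")
    return digits[:3], digits[3:6]


def block_owner(number: str) -> Optional[BlockOwner]:
    """Look up who the number's block was originally assigned to."""
    return PREFIX_TABLE.get(npa_nxx(number))


def accepted_for_sms_otp(number: str) -> bool:
    """Mimic a service that only accepts numbers from 'mobile' blocks."""
    owner = block_owner(number)
    return owner is not None and owner.line_type == "mobile"
```

Because the lookup only sees the block, a ported number keeps whatever classification its original block has, which is why the porting workaround works against this kind of check.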
To get a little deeper into it, Twilio (in Canada at least) doesn’t often own the NPA-NXX block either. Around where I am, the blocks are generally owned by IrisTel, who is a SIP provider in their own right. An old client of mine that had a data residency/privacy issue (their client required all of their data to be processed in Canada) ended up provisioning some numbers directly with IrisTel and doing that integration using FreeSWITCH.
I ported my longtime cell number into GV a while back and have also noticed that it kept working everyplace where I ended up leaving it. I suspect they only run these checks upon the addition of the number, and not ever again.
I hate that the fraudsters make it so that we Can't Have Nice Things, but I also see why and if anything we need more ways to add costs (calibrated to be manageable to spend once, but costly if you get banned daily) for account creation in a lot of places.
I've encountered several services that demand a mobile number for verification. Google Voice numbers are rejected and, surprisingly, so are landline numbers. Only mobile numbers are accepted. It's just another case of how the tech world has outsourced identity verification to the mobile telecom companies.
Companies that mandate use of another company offer a good reason to shun both companies, when there are independent competitors which prioritize customer relationships over "business partner" relationships.
SMS is woefully insecure for multi-factor authentication, when we have TOTP and other open standards that work with local-only password managers.
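For reference, TOTP is small enough to sketch with nothing but the standard library. This follows RFC 6238 with HMAC-SHA1 and a 30-second step; the function names are mine, and a production system would also need secret storage, clock-drift windows, and rate limiting, none of which are shown.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)
```

Both sides only need the shared secret and a roughly synchronised clock, so nothing has to cross the phone network.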
And not only that, most companies that involve SMS in their IDP make it a master key (a single-factor) -- if you can read one text, you can take over the whole account without even having the password. I keep waiting for this to change, but out of all my banks not one supports a proper TOTP.
It's really annoying, especially when they frequently then expect you to enable that cell number as not only a 2FA but really a 1FA (capable of resetting your password WITHOUT the password).
It's because it's super cheap and simple, though, and that's about it.
My impression is that SIP providers are usually not regular telephony companies. So by looking at who the provider is, it is often possible to determine that it is a SIP number and not a regular number. That in turn might make the phone number you are paying for unusable for account activation, because websites will think that you are a spammer.
It used to be the wild west, but the FCC is tightening up (maybe even going too far, we'll see). STIR/SHAKEN and KYC (know your customer) rules are making it more expensive for providers to allow any traffic over their networks. Shady providers would look the other way at spammers pumping traffic (providers getting paid); shady providers mix legitimate traffic in so upstream carriers can't just block them, etc.
Now, there's more regulatory teeth to go after the shady providers allowing this traffic.
No, at least not without spinning up your own MVNO, registering it, and getting the phone numbers you care about ported over to it.
The provider information isn't encoded in the calls themselves. There are essentially a number of centralized databases that can be queried, out of band, to get provider information.
Does anyone have any ideas on doing the opposite, using something like Asterisk to dial into WebRTC meetings or interact with WebRTC speech recognition services?
Would it make sense to run a bridge for Asterisk to call into? Most WebRTC services will have proprietary signaling, so you will have to write some signaling code.
```
Asterisk -> Bridge -> WebRTC Service
```
You could do it all in Asterisk, you will just have to write a fair amount of C code! It's been years since I have done that though. Reach out on https://pion.ly/slack, I would love to help :)
I know FreeSWITCH and Kamailio both have WebRTC connections available, so if you use a SIP client in the browser over WebRTC to the server, you can plug into telephony networks on the server side.
We use Kamailio's WebRTC implementation heavily in Kazoo along with our libwebphone client. The transport is abstracted so Kazoo deals with the device and its configs; the Kamailio instance the browser connects to does the TLS termination for WebRTC. FreeSWITCH has the smarts for the SDP DTLS bits. And it all just works real nice together.
It would be nice if we could revive tel: links. It should be possible to have a tel: URL handler that pops up instructions on how to make the call. Using a smartphone would be a good option. I was thinking about desktop, but this would be perfect.
Or could add the logic from mobile browsers to recognize phone numbers and make them into links.
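A minimal sketch of that recognition step, assuming plain-text input and North American number formats only. The pattern and function name are mine; it will produce false positives on any bare 10-digit string, and real mobile browsers use more involved heuristics.

```python
import re

# Matches forms like 415-555-0100, (415) 555-0100, +1 415.555.0100.
# The lookarounds stop it from matching inside longer digit runs.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def linkify_phone_numbers(text: str) -> str:
    """Wrap recognised phone numbers in tel: links.

    Assumes `text` is plain text, not HTML that already contains markup.
    """
    def to_link(match: "re.Match[str]") -> str:
        e164 = "+1" + "".join(match.groups())  # assumes country code 1
        return f'<a href="tel:{e164}">{match.group(0)}</a>'

    return PHONE_RE.sub(to_link, text)
```

The visible text is left exactly as written, and only the `href` is normalised to digits with a country code.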
I work on software for p2p networking and I'm always researching strange things because of it. One thing that I've discovered is that SIP has a mechanism to relay a single message to a single IP. I know this might not sound like much, but I think that theoretically it might offer a way to reach nodes behind symmetric NATs (which are the most restrictive type of NAT). The idea is that you would have a list of N SIP servers to which you send packets on sip_ip:sip_port. This creates a stateful rule in the router allowing packets back from the server.
Now anyone is able to use that SIP server to relay messages to you so long as you listen on the right ports. This could be useful for highly restrictive NATs like symmetric NATs that only reuse external mappings if an inbound connection uses the same IP and port (more applicable for UDP.) If you can get one, just ONE message to a peer then you can use it to exchange information on strategies to connect directly to them. E.g. TCP hole punching.
There are a metric crap load of SIP servers out there and any one of them would effectively enable you to exchange information with a symmetric NAT and certain firewalls. I think I did basic tests for this ages ago between multiple Internet connections and it seemed to work. So I think this has potential in p2p networking and decentralized apps.
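The underlying mechanics can be sketched with plain UDP on loopback. This is NOT SIP and no NAT is involved; the relay here is a stand-in I wrote for illustration, and all names and the hint message are hypothetical. It only shows the pattern: the peer sends first, the relay learns the source address it sees, and one message comes back on that same path.

```python
import socket


def open_mapping(relay_addr):
    """Send one datagram so a NAT (if present) creates an outbound mapping."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(2.0)
    sock.sendto(b"keepalive", relay_addr)
    return sock


def relay_once(relay_sock, message):
    """Learn the sender's observed address and push one message back to it."""
    _, observed_addr = relay_sock.recvfrom(2048)
    relay_sock.sendto(message, observed_addr)
    return observed_addr


# Stand-in relay bound to an ephemeral loopback port.
relay = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
relay.bind(("127.0.0.1", 0))
relay.settimeout(2.0)
relay_addr = relay.getsockname()

# The peer talks to the relay first, then waits on the same socket.
peer = open_mapping(relay_addr)
peer_port = peer.getsockname()[1]
observed = relay_once(relay, b"hint: try TCP hole punching")
reply, sender = peer.recvfrom(2048)

relay.close()
peer.close()
```

Behind a symmetric NAT, `observed` would be the mapping created for that one destination, so only traffic arriving from `relay_addr` would get through, which is why the message has to be relayed by the server the peer already contacted.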
Session Border Controllers started out as a way to solve interop between complex NAT environments and to normalize SIP implementations, which can and do vary widely, from roll-your-own Asterisk to vendor B's softswitch. So just a friendly word of caution: if you are considering any feature out of the SIP "standards", the more obscure it is, the more you need to be sure you own every part of the chain from the user agent on down.