TCP/IP over audio (anfractuosity.com)
137 points by deutronium on Feb 17, 2014 | 67 comments

Remember the BBC micro and other computers of that era, which used audio tapes to save programs? I recall a BBC TV programme about technology from when I was a kid. In at least one episode, they played such a tape on the air, so that we could record it (by holding the cassette recorder near the TV speaker), and then load the program ourselves.

Is this a false memory or can someone else confirm seeing this?

Something similar was provided via radio (not TV): https://en.wikipedia.org/wiki/BASICODE

It's possible that the BBC broadcast data in the same way (conceptually, even if not in format) on their TV programme.

It's probably true. The rate of data on the tapes was probably slow enough that even over a lossy TV broadcast signal it would still survive.

I think the Commodore Cassette drive was 50 bytes/sec?

I was able to copy Commodore cassettes using my father's high end audio setup. A friend with a standard two-tape deck ended up with corrupted tapes though, so the bit rate must have been close to the limit of the medium.

Your friend probably had a problem with azimuth (head alignment), which can usually be fixed with a small screwdriver:


Yeah, that is correct. I'm pretty sure Micro Live did this.

As an aside, the tun/tap interface in Linux is a fantastic way to muck around with lower-level networking without getting into the kernel or hardware. Essentially, it creates a virtual network interface, except instead of hardware, data goes to your userspace program. You can then just do a read() to grab packets and do whatever you want with them, using Python, Ruby, or any other programming language.
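
For anyone who wants to try the read() approach, here is a minimal sketch in Python. It assumes Linux, root (or CAP_NET_ADMIN), and the ioctl constants from <linux/if_tun.h> as they appear on x86 and ARM; the interface name tun0 and the helper names open_tun, describe_ipv4 and dump_packets are made up for illustration.

```python
import fcntl
import os
import socket
import struct

# Constants from <linux/if_tun.h>. The TUNSETIFF value is the x86/ARM one
# and is an assumption here; some architectures encode ioctls differently.
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001     # layer-3 device: read() returns raw IP packets
IFF_NO_PI = 0x1000   # do not prepend the 4-byte packet-info header


def open_tun(name="tun0"):
    """Create (or attach to) a TUN interface and return its file descriptor."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    # struct ifreq starts with a 16-byte name followed by a short of flags.
    ifr = struct.pack("16sH", name.encode(), IFF_TUN | IFF_NO_PI)
    fcntl.ioctl(fd, TUNSETIFF, ifr)
    return fd


def describe_ipv4(packet):
    """Return (source, destination, protocol number) of a raw IPv4 packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    return src, dst, packet[9]


def dump_packets(fd, count):
    """Read `count` packets from the TUN device and print a summary of each."""
    for _ in range(count):
        packet = os.read(fd, 2048)
        try:
            print(describe_ipv4(packet))
        except ValueError:
            pass  # e.g. IPv6 traffic; ignored in this sketch
```

To see traffic you would call dump_packets(open_tun(), 10) as root, then bring the interface up and give it an address from another shell. Writing bytes back to the same descriptor injects packets into the kernel, which is where an audio modem (or any other transport) would plug in.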

I wrote a prototype proxy in Go that can split network traffic over two Internet connections using some hacked-together tun code[0] and everything happens in userspace.

[0] https://github.com/erjiang/tuntuntun

Yeah, I'm trying to tunnel TCP packets over Facebook chat[0] (to get some free Internet when using "social media" mobile plans).

[0] https://github.com/matiasinsaurralde/facebook-tunnel

Interesting. Have you tried using a DNS tunnel? It would be similarly slow if it worked over your connection.

Yes, both ICMP and DNS tunnels work on most of the mobile plans here (I'm from Paraguay, South America). We had an interesting experience with the Internet.org campaign: they signed an alliance with the biggest telco (Tigo - Telecel, Millicom International), so all the users are getting free Facebook access. This small project I'm trying to code involves a critique of the Internet.org campaign ("free Internet for poor countries"): Internet != Facebook.

OpenVPN is the king of all tun/tap applications. A gander at its code is a good way to get a serious look at implementing your own networking with tun/tap interfaces.

Neat project, I like the way I say tuntuntun in my head. Your abbreviation of "ttt" makes me think of Technojock's Turbo Toolkit... ya, I'm old :)

There have been several threads on HN about ultrasonic networking in recent months. One was a simple chat client, called Quietnet [1]. The other was malware called BadBIOS, which has the ability to communicate over hi-def audio [2].

[1] https://news.ycombinator.com/item?id=7024615

[2] https://news.ycombinator.com/item?id=6654663

The alleged ability to communicate over audio, since its very existence is questionable.

Does nobody here remember modems? You take an old 2400 baud modem that's supposed to work over a crackly telephone line - www.engadget.com/media/2009/05/090527-300baud-01.jpg

That's literally how TCP/IP was invented. This stuff - pure, analog audio - went all the way up to 28.8kbps, which was the limit of the line bandwidth. Why not just take those protocols and put them into a speaker and microphone? Why bother writing new protocols? Just start with modem protocols and play with the frequencies.

Home Internet connectivity was literally born out of TCP/IP over audio...

Communicating through copper wires using electrical signaling at audio frequencies is _very_ different from communicating through air using acoustic waves. Even assuming 'perfect' speakers and microphones you have completely different transmission effects than you have with low frequency electrical signals in copper.

Please do thorough research. Modems were originally, more often than not, used with acoustic couplers.

Exactly! And the DIY version of it looked like this:


And it supports 300bps and 1200bps speeds.

It's unclear what difference that makes in the context of my post. The fact remains that the modulation and signaling protocols developed for use in modems were designed within the constraints of electrical transmission across miles of copper wire, not through sound waves in an unknown acoustical environment. The constraints for the two situations are entirely different. Acoustic couplers are designed to minimize any acoustical impact of their use to avoid the exact acoustical problems you'd have transmitting across a room.

Yes. Also, the upper limit on the frequency range of the telephone network is somewhere around 3400 Hz, well within the range of human hearing. This protocol is designed to send data in a frequency range that most people can't hear, or at least can't hear very well.

When I was a child (probably over 20 years ago) I would watch my mother transfer text documents via the computer. The modem was a device that exactly fitted the telephone horn. So you had to pick up the phone, dial the number, then put the horn in the modem. I don't recall exactly but I think you could hear the computer "talk" to the other side.

I'm not sure how different that technique is from what is posted in this article.

Didn't some old modems use an acoustic coupler (i.e. a cradle which held the handset)?

Yes, they did. I've still got an old one in a cupboard somewhere.

The question is not "can you transmit digital information over audio". That's an obvious yes. The question is, can laptop speakers transmit at ultrasonic frequencies, and can a laptop microphone pick those frequencies up sufficiently to actually communicate.

This guy used 17 and 18 kHz as carriers. That is within the hearing range of young people, but a quick experiment shows that either my desktop speakers cannot reproduce 17 kHz, or I have recently become too old to hear 17 kHz.

> This guy used 17 and 18 kHz as carriers. That is within the hearing range of young people, but a quick experiment shows that either my desktop speakers cannot reproduce 17 kHz, or I have recently become too old to hear 17 kHz.

It's easy to find out -- go here:


Enter any frequency you want. Chances are if you're over 40, you can't hear 17 KHz. I'm 68 and I can't hear anything over 6 KHz.

> ... or I have recently become too old ...

It usually takes longer than that. :)

> Enter any frequency you want. Chances are if you're over 40, you can't hear 17 KHz.

(Maybe I should get better speakers...)

> (Maybe I should get better speakers...)

Maybe, but you can test your hearing range more easily by putting on a set of headphones and experimenting with the linked signal generator -- reasonable quality audio headphones should easily be able to exceed your own hearing range on the high end.

And chances are typical computer speakers should also be able to exceed a normal hearing range.

>or I have recently become too old to hear 17 kHz.

If you're in your mid-20s (and you're on HN, so you probably are), it's not terribly unlikely that you've recently lost that part of your range. In my experience (back when I was young enough to hear those frequencies), even crappy speakers can usually go up to 20kHz easily. Very low frequencies are another story.

Do we have an age poll for HN users?

I am almost positive that there was one a while back, but I'm having trouble finding it.

> The question is, can laptop speakers transmit at ultrasonic frequencies, and can a laptop microphone pick those frequencies up sufficiently to actually communicate.

Given two separate examples of running code, the question is really what percentage of laptops can do this. It's possible that some systems have filters but so far I haven't found one.

> Why bother writing new protocols? Just start with modem protocols and play with the frequencies.

Technically PPP is a point to point protocol which makes everything easier (no real addressing, no routing and so on). This protocol would be more comparable to wifi.

At the end of its service I had flashed my USRobotics Courier modem from 9600k to 56.6k. That was still audio over POTS.

56.6k speeds required a digital POTS: it transmitted PCM through the digital portions of the network, only to become analog at the last moment, from the nearest pole switch to your house. In other words, the modem you were calling had to be 'all digital' and plugged into the telecom network. There wasn't a lot of connection left to 'audio' for these modems, other than their baseband frequency range (0-4 kHz).

You are mistaken. 56K baud was the 'last' of the POTS modems, and it worked over an unimproved POTS line. Work beyond 56K was not pursued because ISDN was 64K and "imminent" (only to be DSLAMed :-) by ADSL.)

The distinction between 56k and slower speeds was that 56k could not be transmitted over the digital part of the POTS network in its modem audio format. Telephone systems are generally analogue between the house and the exchange (or road-side cabinet), and then digital between that and the main telephone network. The digital encoding was 8000 samples per second, at eight bits per sample, with a logarithmic transfer function in order to reduce noise levels when the line is quiet. So, the digital part of a telephone line ran at 64kb/s.
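
The arithmetic and the "logarithmic transfer function" above can be sketched in a few lines of Python. This is only an illustration: it assumes the mu-law curve with mu = 255 (A-law networks use a different curve) and uses the continuous formula rather than the exact segmented encoding a real exchange would use. All function names are made up.

```python
import math

SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
MU = 255  # assumption: mu-law companding; A-law uses a different curve


def channel_bit_rate():
    """Bit rate of one digitised voice channel, in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE


def mu_law_compress(x):
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode(x):
    """Quantise a linear sample to one of 256 codes (0..255)."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return min(255, int((y + 1) / 2 * 256))


def decode(code):
    """Reconstruct a linear sample from the midpoint of a code's interval."""
    y = (code + 0.5) / 256 * 2 - 1
    return mu_law_expand(y)
```

channel_bit_rate() gives 64000, the 64kb/s figure. Because the quantisation steps are packed tightly near zero, a quiet sample such as 0.01 survives the encode/decode round trip with a much smaller absolute error than a loud one such as 0.9, which is the noise benefit on quiet lines.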

In order to support 56k speeds, the exchange (or road-side box) had to be upgraded. Normal exchanges used a simple 8-bit DAC and ADC, but when a 56k modem was attached it would have to switch to a different mode. Effectively, the remote modem was moved from the ISP to the exchange, with the digital part of the POTS network transmitting de-modem-ed data, rather than a digital representation of an analogue signal encoding digital data, which is what happens with slower speeds.

So yes, the 56k signal was transmitted over an analogue line, but no the 56k signal was not transmitted over the main POTS network. Work beyond 56k was not pursued because the underlying digital telephone network could not actually transmit data at a higher rate.

This is also where the 64kb/s speed of an ISDN line comes from, by the way. That just extends the digital part of the POTS network to the house. ISDN allows a slightly higher data rate (64kb/s rather than 56kb/s) because it has a side channel that carries the control signals such as dialling and flow control, whereas 56k modems had to send flow control and error correction in-band.

No need to take my word for it, here is the spec: https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-V.90...

Thanks for the link. Want me to send you a Rockwell chip that negotiates 56K over a POTS line? I think I've got about a dozen left. I don't have my line simulator anymore, but when the company I was with was shipping our business gateway, the modem chipset was qualified and homologated for 15 or so different markets. The line analyzer didn't have a digital bone in its body, so I doubt the chip would have been able to negotiate anything other than an analog connection through it.

Of course the chip could lie about connecting at 56K I suppose.

I think your memory might be faulty, the consumer end was POTS, but the head-end needed upgrades from the phone company. If the rest of the links in the phone company were all digital then you could get 56k, if not you were stuck with 33.6 at best.

At the ISP I worked at back in the mid-late 90s, I definitely remember us having to upgrade the lines and buy a bunch of fancy new rack mounted modem equipment to support 56k. IIRC, the equipment and lines could also handle single and dual channel ISDN connections, which we charged a pretty penny for. But the customer had to have a digital connection at their end. Also, IIRC, the actual limit was 53k for some regulatory reason.

Here's some period reading






Of course I might be wrong, I've forgotten almost everything about the ISP business.

fun note: Before that we had built several racks of custom dial-in hardware out of unboxed external USR 33.6 modems, racked and fan cooled, with long power strips along both sides for power.

Troubleshooting was accomplished by turning up the volume on a suspect modem and waiting for somebody to dial in and listening to a couple fails (we couldn't dial into it directly due to how the incoming telephone calls were routed). A trip down to the local computer shop to buy another modem, crack the case open and voila, fixed.

We eventually were forced to sell the company with the advent of DSL. Even if we could have afforded the head-end equipment (which we couldn't), it wouldn't have put us in practical competition with the phone companies, who were better positioned to offer the service.

It actually depended on the PSTN links.

The line from your customer to the central office could NOT be digital, because if it were, it meant you were most likely connected to a SLC (aka remote terminal).

The goal of a SLC was to serve a remote area at low cost, by avoiding the need to deploy tons of copper. Most SLCs achieved this by taking a T1 (traditionally 24 POTS lines) and performing an analog-to-digital conversion AND compression (instead of giving you 64 kb/s of bandwidth per channel, you got 16 kb/s) to increase the number of voice lines that could be served.

So instead of deploying 96 pairs of copper, for 10 miles, at a cost of $$$$$ per mile, you deployed 2-4 pairs of copper for 10 miles at $ per mile plus the fixed cost of $$ for the purchase/lease of a SLC.

(The downside is the occasional line break caused by a car accident or hungry squirrel, would take down an entire neighborhood instead of just one customer)

On the provider side, you needed a provider that would deliver you PRIs with echo cancellation turned OFF on your channels.

Echo cancellation helped telecom providers, again, by saving bandwidth. With the advent of packet-switched networks, your voice calls were converted to digital audio, and sliced into chunks of 64KB, and delivered as packets to the remote switch.

Silence suppression was a common feature of packet-switched networks -- instead of delivering 64KB packets of silence, you delivered nothing, which allowed providers to reduce the aggregate amount of bandwidth they needed to service all the calls transiting their network.

With echo cancellation, it further increased the amount of packets the provider could drop to the floor which reduced their overall bandwidth needs. Your provider most likely marketed this new technology as "better quality voice calls" (remember the AT&T commercials inviting you to test the quality of their 'new digital network' for yourself, by calling 10-10-288-something and listening to a clip of Whitney Houston singing?)

The problem is, echo cancellation generates false positives on modem calls, and the subsequent silence suppression would cause 56K calls to retrain down to a level that would eventually not be affected by the echo cans, resulting in a 31.2Kbps or 33.6Kbps call.

This was a highly annoying discovery for providers who invested in brand new 56K-capable modem equipment.

Normally, you could just ask your provider to turn off the echo cans, but after the Telecom Act of 1996, a flood of new providers entered the market who were merely just reselling other larger carriers. This was a problem as your reseller probably didn't know what echo cans were, let alone had the "pull" to ask the underlying provider to turn it off on your behalf.

As an ISP, if you couldn't get echo cans turned off, then your only option was to cancel and find a new provider.

mbell, please don't take this the wrong way, but you seem to be a young couch researcher. I used to think I was the fountain of all knowledge with my exemplary Google searching skills, but cursory searching and skim reading don't reveal the wealth of knowledge and facts that are still locked up in engineers' heads and dusty old textbooks.

I swear that I used to call another friend's BBS that also had a Courier modem hooked up to a POTS line and connect at 56.6k.

That is unlikely, the relevant standard is V.90: http://en.wikipedia.org/wiki/V.90_(recommendation)

Only downstream was 56.6k PCM, upstream was 33.6k analog-modulated, so two consumer modems hooked together would only be capable of 33.6k.

There was a later V.92 standard that improved upstream speed, but it was a bit wonky as it sacrificed download speed to do it. I'm not sure how much it was really used.

#BadBIOS might be vapor but #RadBIOS is coming!


This is neat, but for a teenager, 18kHz is more "instant headache" than "ultrasonic".

Amateur radio folks have been doing this for a long time, since the early 1990s at least.

Radio is electromagnetic waves, not sound waves.

He's more correct than you think. Since the 90s there's been an explosion of sound card based digital modulation schemes. Usually you see a waterfall spectrum from 0 to "whatever" and you click where you want to operate. Essentially the attached SSB transceiver is a linear transverter from audio to some RF.

So instead of tuning your radio to 7.070 MHz and clicking on the computer waterfall at 1200 Hz (aka operating at 7.0712 MHz) you just unplug the radio from the computer, use speaker and mic instead of something plugged into speaker and mic jacks and click on 18 KHz or whatever.

In the early days of PSK-31 and other modes you did demos at radio club meetings and whatever by just letting it rip over the speakers. Loud and annoying but works pretty well across a room or further.

The main limitation is most SSB communications radios cut out around 3400 Hz at the high end so the software that is written for sound card digital modes cuts out somewhere above that, but sometimes not as high as 25 KHz or whatever, because 99% of the users and devs will never fool around with ultrasound.

Because I like to use a HF modulation mode called Olivia I mostly use an old version of DM780 software (from when it was free) and multipsk (free) although I could use FLDIGI (which is also free). In my infinite spare time I'll see what unnatural limits have been built into the software. This topic of ultrasound networking comes up on HN like clockwork every week or so, so I'll report back next time.

(Edited to add, one thing I like about multipsk is the, um, innovative UI. Go to images.google.com and take a look.)

Side question: UDP seems to be the preferred protocol for VoIP-type of services. While the size of a UDP message is generally much larger than that of TCP, and some missed UDP messages are acceptable (analogous to cellular communication some signal weakness/loss is acceptable), any other strong reasons people prefer UDP over TCP in VoIP?

What happens when packets are dropped in UDP? A bit of audio corruption.

What happens when packets are dropped in TCP? A hard-to-predict amount of delay. And retransmission of audio data that you might just throw out anyway. (There are alternatives to discarding the data that should have already been heard, like temporal compression, but that's hard to do right, too.)

With TCP, when you drop a packet, you wait for it to requeue, and the application gets no data. You can see the effects in some poorly implemented video conferencing systems. You'll see someone speaking, then it'll pause for a few seconds, then the system will play the recovered video really fast to catch up. It's quite annoying.

Instead, for realtime applications, you make sure each packet is usable by itself, and make your system gracefully degrade in face of packet loss.

(Or, if you're from the insane world that designed Fax-over-IP, you transmit 2 or 3 packets for each packet, and hope that staves off packet loss.)

UDP is used in a lot of environments where a few dropped packets aren't an issue, because the package guarantees in TCP slow things down, and TCP is generally a heavier protocol so it's slower... because... ?

I think that I actually don't know any more than you about it.

TCP gives you two guarantees over UDP:

* All packets arrive. (Reliable transport).
* All packets that arrive are in order. (Sequential transport).

Unfortunately, neither of these is all that necessary for real time audio (or, to a somewhat lesser degree, video). In fact, they get in the way.

Missing packets can often be replaced in VoIP settings with various kinds of interpolation. And out of order data can be buffered and used to build the interpolated data before playback. Interestingly enough, even packets that have been damaged still often are usable to VoIP applications, since they still contain some payload that can be used to improve the signal.
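
To make the buffering and interpolation idea concrete, here is a rough Python sketch of a receiver that reorders sequence-numbered audio frames and papers over the missing ones. It is not how any particular VoIP stack does it (real codecs use much smarter concealment); the function names and the simple averaging rule are assumptions for illustration.

```python
def conceal(prev_frame, next_frame, frame_size):
    """Guess the contents of a lost frame from its neighbours."""
    if prev_frame is not None and next_frame is not None:
        # Both neighbours known: average them sample by sample.
        return [(a + b) / 2 for a, b in zip(prev_frame, next_frame)]
    if prev_frame is not None:
        return list(prev_frame)      # repeat the last frame we played
    if next_frame is not None:
        return list(next_frame)      # borrow the frame that follows
    return [0.0] * frame_size        # nothing to go on: play silence


def reassemble(packets, frame_count, frame_size):
    """Turn (seq, samples) pairs, in arrival order, into a playable stream.

    Packets may arrive out of order, be duplicated, or be missing.
    """
    received = {}
    for seq, samples in packets:
        if 0 <= seq < frame_count:
            received.setdefault(seq, samples)  # keep first copy, drop dupes

    output = []
    for seq in range(frame_count):
        if seq in received:
            output.append(list(received[seq]))
        else:
            prev_frame = output[-1] if output else None
            output.append(conceal(prev_frame, received.get(seq + 1), frame_size))
    return output
```

For example, reassemble([(0, [0.0, 0.0]), (2, [2.0, 4.0])], 3, 2) fills the lost middle frame with [1.0, 2.0]. Nothing ever waits for a retransmission, which is the property TCP cannot give you.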

In the old days, they used to call this a "modem". Long live the Hayes 1200 baud modem of the early 1980's.

Is there any step-by-step setup guide for dummies? I'm entirely new to gnuradio and have no idea how this is supposed to work. It seems very interesting though!

By the way, I'm using Linux Mint 15 and fairly experienced with Debian-based Linux systems.

I couldn't see any information on the linked page - anyone know what the latency and throughput characteristics of this set up are?

Would be very interesting to understand how audio attenuation impacts the TCP/IP connection.

I haven't done proper throughput tests with it yet.

I'll see if I can dig out a pcap file I took using it, which should give you an idea of the latencies involved.

[Edit] I've just updated the page with a wireshark screenshot which shows the latencies and also a link to the pcap file.

Something which I'm keen to look at doing is extending it beyond 2-FSK, to improve the throughput.
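
For readers wondering what 2-FSK amounts to, here is a toy Python sketch, not the GNU Radio flowgraph from the project. It assumes a 48 kHz sample rate, 17 kHz for a 0 bit and 18 kHz for a 1 bit (the bit-to-tone mapping is a guess), and 480 samples per bit, i.e. 100 bits per second; the receiver compares the energy at the two tones with the Goertzel algorithm and assumes it already knows where each bit starts.

```python
import math

SAMPLE_RATE = 48000      # Hz
FREQ_ZERO = 17000        # Hz, assumed tone for a 0 bit
FREQ_ONE = 18000         # Hz, assumed tone for a 1 bit
SAMPLES_PER_BIT = 480    # hypothetical symbol length: 100 bits per second


def modulate(bits):
    """Return audio samples in [-1, 1] with one tone burst per bit."""
    samples = []
    phase = 0.0
    for bit in bits:
        freq = FREQ_ONE if bit else FREQ_ZERO
        step = 2 * math.pi * freq / SAMPLE_RATE
        for _ in range(SAMPLES_PER_BIT):
            samples.append(math.sin(phase))
            phase += step  # carry the phase over so there are no clicks
    return samples


def tone_power(block, freq):
    """Goertzel algorithm: signal power in `block` at a single frequency."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def demodulate(samples):
    """Recover bits by comparing the power at the two tones, bit by bit."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        block = samples[start:start + SAMPLES_PER_BIT]
        one = tone_power(block, FREQ_ONE)
        zero = tone_power(block, FREQ_ZERO)
        bits.append(1 if one > zero else 0)
    return bits
```

Each symbol carries a single bit here, which is why moving to a scheme with more states per symbol raises throughput at the same symbol rate.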

Definitely, to QQAM encoding and you should be able to get a decent baud rate out of it. I believe both QAM and QQAM are built into gnuradio as encoding options.

How is it that they receive and transmit ultrasonic frequencies with a computer sound card? Is there not a low-pass filter at around 22KHz on inputs and outputs of all sound cards?

I was able to transmit at up to around 23kHz; you may be able to make out the spike on the FFT graph in the video I made for some previous work: http://www.youtube.com/watch?v=H0DKRl8XIcU

To get up to 23kHz I used a sample rate of 48kHz.

> Is there not a low-pass filter at around 22KHz on inputs and outputs of all sound cards?

It's not a low-pass filter per se; the limit is imposed by the Nyquist–Shannon sampling theorem (http://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_samplin...), which essentially limits the frequency spectrum to 1/2 the clock rate. Good-quality audio from a sound card is typically clocked at 44.1 kHz, which means the audio range extends only to about 22 kHz.

But you can clock a modern sound card at a rate much faster than 44 KHz, so the above might be only the normal limit, not the extreme.
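
The half-the-clock-rate rule is easy to play with in Python. The sketch below is just the textbook folding arithmetic with made-up function names; it ignores whatever analog filtering a given sound card may or may not have.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone shows up after sampling with no filtering."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)
```

At 44.1 kHz the limit is 22.05 kHz, so a 23 kHz tone would fold back to 21.1 kHz; at 48 kHz the limit is 24 kHz and the same tone comes through at 23 kHz, which fits the 48 kHz sample rate mentioned above for reaching 23 kHz.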

The built-in audio on most PCs these days supports 96kHz or even 192kHz sample rates.

The DAC may sample at that resolution, but it's another matter if the driver will pass an ultrasonic frequency. Looking at a few datasheets from various manufacturers, I don't see any mention of bandpass filters. So either they don't have them or they don't think it's worth mentioning.


This to me is incredibly awesome. The question is, could you use radio frequencies meant for speech/music for it? I think so, since Morse code is sent on these same frequencies. What about streaming video over short wave radio?

> What about streaming video over short wave radio?

That's what TV broadcasting does. But it's radio (electromagnetic) waves, not sound waves.
