Minimodem – General-purpose software audio FSK modem (2016) (whence.com)
157 points by mmastrac on Feb 2, 2018 | 62 comments

I bought a laptop in the late 90's with a 'winmodem' and was sad to find it had no Linux support. I set out to add it.

I bought the Linux device drivers book and started researching winmodems. I discovered the low-level driver interface was built into the audio chip. I believe the standard was called 'AC97'. Borrowing code from the audio driver, I managed to activate the RJ11 jack, which I hooked a tiny speaker to, and played audio out the phone jack. I got as far as playing DTMF tones before giving up on the project. The details of how modems work at the modulation level were way over my head.

Today we have many projects with no Windows support, like this Minimodem. I couldn't even develop React or React Native apps on Windows until a few years after they had been out. I would love to be able to use Redis on Windows in production for corporate Intranet apps where they often run Windows only.

It's kind of annoying to me, but I don't blame the authors. I feel like many Linux users blamed Microsoft for things like this, but honestly - why should anyone go out of their way to support an OS that they don't use?

Surely there is a difference between the two. React is open source and you could have solved your problem yourself or hired someone to do it. What can a Linux user do about proprietary hardware other than complain to the manufacturer?

Presumably because that OS supports you right back. If Windows open-sourced a few key components, like DirectX, I'd be a lot more inclined to support it.

Yeah, I hear what you’re saying. I deploy my stuff to Linux, but putting aside any ideology, I just vastly prefer a Windows desktop.

Sorry about hearing you three days late, but I agree with you on this... and so I decide to use Windows, and the box shuts down automatically to do an update, and I wait 20 minutes for it to apply its patches.

And then I reboot into Linux, run sudo apt-get upgrade, and remember why I use Linux all the freakin' time.

Also tangentially relevant: In amateur radio, soundcard "A"FSK (A for audio) is super popular. The most popular are the time-synchronous WSJT-X modes (JT65, JT9, FT8, etc). It's super fascinating from a technical robustness point of view, especially the FT8 mode. The program is authored by a Nobel laureate in physics, and a very active radio ham, K1JT.


Tune to any of the FT8 frequencies on http://www.qsl.net/sv1grb/psk31.htm using almost any receiver on http://websdr.org/ - 20m in the daytime, and 40m during evenings is usually full of activity.

There are dozens of other time-asynchronous modulations supported by fldigi, like the second most popular mode, PSK31, and RTTY. And Morse code too :) http://www.w1hkj.com/

Just a small nit: most of those are FSK, but PSK31 and its variants are actually phase-shift keying[1], which is pretty cool in its own right. Rather than changing frequency, it shifts the phase of the carrier. When combined with a signal 90° out of phase (often called quadrature, hence the I/Q signals you see in SDRs), you can encode more than one bit per symbol.

Either way it's pretty cool stuff.

[1] https://en.wikipedia.org/wiki/Phase-shift_keying
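
To make the "more than one bit per symbol" point concrete, here is a minimal sketch of four-phase keying (QPSK) in Python. The bit-to-phase table is an assumption picked for illustration, not the constellation of any particular ham mode, and the function names are hypothetical.

```python
import cmath
import math

# Gray-coded QPSK: each pair of bits selects one of four carrier phases.
# This bit-to-phase table is an illustrative assumption.
QPSK_PHASES = {
    (0, 0): math.pi / 4,
    (0, 1): 3 * math.pi / 4,
    (1, 1): 5 * math.pi / 4,
    (1, 0): 7 * math.pi / 4,
}


def qpsk_modulate(bits):
    """Map an even-length bit sequence to unit-magnitude I/Q symbols."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    pairs = zip(bits[0::2], bits[1::2])
    return [cmath.exp(1j * QPSK_PHASES[pair]) for pair in pairs]


def qpsk_demodulate(symbols):
    """Recover bits from the signs of Q (imaginary) and I (real)."""
    bits = []
    for symbol in symbols:
        bits.append(1 if symbol.imag < 0 else 0)  # first bit lives in Q
        bits.append(1 if symbol.real < 0 else 0)  # second bit lives in I
    return bits
```

Eight bits go in and four symbols come out, so each symbol carries two bits. A real receiver would also have to recover carrier phase and symbol timing, which this sketch skips.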

Also forgot to mention it, but an even wider use of AFSK than FT8 and its WSJT cousins is AX.25/APRS over VHF. Sadly it's not nearly as advanced as FT8, since it most commonly uses Bell 202 tones and a blazing 1200 bps. There have been other digital formats, but the whole situation is pretty fragmented.
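
For anyone curious what Bell 202 tones amount to, here is a rough Python sketch of 1200 baud AFSK using the standard 1200 Hz mark and 2200 Hz space tones. The 48 kHz sample rate and the function names are assumptions, and it deliberately leaves out the NRZI encoding and HDLC framing that AX.25 puts on top.

```python
import math

SAMPLE_RATE = 48000                     # assumed sound card rate
BAUD = 1200
MARK_HZ, SPACE_HZ = 1200.0, 2200.0      # Bell 202 mark (1) and space (0)
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD   # 40 samples per bit


def afsk_modulate(bits):
    """Return phase-continuous float samples in [-1, 1]."""
    samples, phase = [], 0.0
    for bit in bits:
        step = 2 * math.pi * (MARK_HZ if bit else SPACE_HZ) / SAMPLE_RATE
        for _ in range(SAMPLES_PER_BIT):
            samples.append(math.sin(phase))
            phase += step               # carry phase across bit boundaries
    return samples


def tone_energy(chunk, freq):
    """Energy of one bit period at a single frequency (a one-bin DFT)."""
    w = 2 * math.pi * freq / SAMPLE_RATE
    i = sum(x * math.cos(w * n) for n, x in enumerate(chunk))
    q = sum(x * math.sin(w * n) for n, x in enumerate(chunk))
    return i * i + q * q


def afsk_demodulate(samples):
    """Decide each bit by comparing mark and space energy."""
    bits = []
    last_start = len(samples) - SAMPLES_PER_BIT
    for start in range(0, last_start + 1, SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        mark = tone_energy(chunk, MARK_HZ)
        space = tone_energy(chunk, SPACE_HZ)
        bits.append(1 if mark > space else 0)
    return bits
```

The demodulator assumes it already knows where each bit starts. Finding that out from a noisy signal is a large part of what a real modem does.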

The infrastructure side of APRS is pretty nifty: sites like aprs.fi[1] let you see the current state of the world in near real time anywhere an I-Gate is present.

[1] http://aprs.fi/

Every time I see websdr.org mentioned, I supply this caution:

I just want to caution folks about visiting this site. You could find yourself getting sucked into a years-long obsession with RF and software defined radios. Pretty soon you'll have dozens of RTL SDRs strewn about your desk, coax tumbleweed snaring your feet and six iterations of a quarter-wave ground plane that still don't work quite as well as the first one you ever built.

Only tangentially relevant, but one of my favorite images on the internet: an annotated frequency domain visualization of a modem handshake by Oona Räisänen.


That is such a great diagram. Pretty awesome seeing the split tones in DTMF, frequency probing and negotiation.

Steve Underwood has a FAX softmodem called spandsp[1]. Dude knows a lot about signal processing, which is a topic way beyond my wizardry level.

Many years ago I used it successfully for a FAX server made with Asterisk and cheap telephony cards. Since a FAX is just a TIFF file encoded over sound, I think this may be useful for someone who wants to transfer a binary blob over sound.

[1] https://www.soft-switch.org/spandsp-modules.html

It's a different image format described in the ITU T.43 spec, but it's fairly trivial to convert to TIFF.

You really, really want to play with this for yourself, if just for two minutes.

It takes <5 seconds to compile from source. However, I'd recommend you

  $ ./configure --without-pulseaudio
as PA really, really chews CPU when used as the audio pipeline for this. However, if your soundcard isn't set up for (or capable of) loopback audio, you might need to use PA.

From there, it's just a simple case of

  $ cat walloftext.doc | ./minimodem --tx --tx-carrier 1200
in one terminal while you

  $ ./minimodem --rx 1200 -q
in the other. (Start the receiver first.)

My (very) old laptop seems to be able to go up to 9600 baud (sans PA); beyond 10000 it completely falls apart, with literal (!) modem line noise in the data (looks hilarious), ALSA underruns on the transmit end and 100% CPU usage between the two processes. (Hardware from != 2006 will probably be able to go much much higher though.)

This is an extremely simple modem implementation: it's just stdio<->DSP, with nothing else on top. And of course it's not bidirectional, so you need one process for input and another for output. You'd need two copies each running on both sides of a link for full duplex. And then you'd need a tunnel interface on top of that to get a PHY, and probably PPP over the top of that for low-level retransmission and flow control. (The major additional missing piece would have to be that real modems "train" to find the best baud rate and audio characteristics to use for the link. This of course has none of that. You have to figure out the best link speed yourself.)

Having said all that... I just learned how fast 9600 baud really is. That was absolutely awesome.

I also discovered that 600 baud is kind of hypnotic to listen to and watch. lol

And now I'm wondering about socat...

Even for a decade-old computer, 9.6k feels surprisingly low, considering that we were pushing 56k on softmodems far earlier.

I wrote a little GNU Radio patch and flow graph to transmit TCP/IP over audio, not sure if it still works though as I haven't tested it on recent versions - https://www.anfractuosity.com/projects/ultrasound-networking...

Would this work between two modern laptops a few feet apart using built in mic and speakers, assuming some ambient noise?

Yes, a friend from the robotics club at school and I tried it the other year (everyone else in the room was pretty annoyed). It works, but it's very finicky:

Your sound card is going to work best in some small band of frequencies (around 3 kHz, if I remember), the mark and space tones have to be pretty far apart, and the volume has to be turned up pretty high.

Turning the baud rate up increases your error rate, so unless you're including error correction in your transmission, it has to be pretty low.
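
As a rough illustration of what "including error correction" could mean at its very simplest, here is a triple-repetition code with majority voting in Python. This is an assumed example, not something minimodem provides, and real links would use a far more efficient code.

```python
def encode_repeat3(bits):
    """Send every bit three times."""
    return [bit for bit in bits for _ in range(3)]


def decode_repeat3(coded):
    """Majority-vote each group of three received bits."""
    if len(coded) % 3:
        raise ValueError("coded length must be a multiple of 3")
    return [
        1 if sum(coded[i:i + 3]) >= 2 else 0
        for i in range(0, len(coded), 3)
    ]
```

This survives one flipped bit per group of three, at the cost of cutting the useful data rate to a third, which is the same trade-off as lowering the baud rate, just made in software.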

My guess is that the soundcard works pretty uniformly well between 100Hz and 15kHz. But the speakers and microphones might not.

Yup, you can run a speaker cable between mic/line-in and speaker/line-out to pretty good effect. When I'm working on radio protocols I usually use this if I don't feel like setting up a pair of radios.

If you want to try that with a different modem in your browser, check out https://quiet.github.io/quiet-js


Inquiring minds want to know: can this thing run FidoNet over VoIP?

There is a demo video where someone dials out on a VoIP line @ 300 baud using Bell 103 modulation. Apparently most modems, when they sense the carrier for that modulation, will skip negotiation and go right into 300 baud mode.

The aggressive compression on the VoIP line and lack of compression does cause problems.

The frequency is adjustable. I was just now playing around with 2 bps transmissions.

Surviving a vocoder is harder, but generally if you crank down the bitrate enough, it will probably work.

See also https://github.com/quiet/quiet

Not as much oriented towards standard modes, but if you are working on custom applications then quiet can be useful

And there's a browser demo too :)


HAHA wooow. I used this to make tones to allow Asterisk to control an emergency alert system from the late 80s or early 90s. This was in 2013, I believe.

Well, the first question that comes to my mind, and I'm not sure if it is the right question, is how can I use Minimodem for side channel analysis? Reading about it gives me a faint idea that I can carry out a timing attack / acoustic cryptanalysis / electromagnetic attack. Would anyone knowledgeable here shed more light on my line of thinking, please?

I did a brief armchair check of the patent dates related to V.34 and I think they should all be expired (from the dates on the ISO site) - so connecting at up to 33,600 bps bidirectionally on a robbed-bit-signalling DS0 line should be implementable in software legally ... I think Hylafax or a similar project has a library implementing these...

Can you get it to transfer data ultrasonically?

Computer sound devices tend to have a roll-off filter just beneath the Nyquist limit of the DAC. So, higher-end audio devices that can sample at 96 kHz or higher could presumably do so.

But, then, you'd need speakers that can reproduce it and a microphone that can pick it up, too. Also, reflections are surely a problem at higher rates. The examples shown are well within the range of human hearing (but, old modems had to work within the bandpass range allowed by phone networks, which rolled off at something like 3.4 kHz, since the digital side sampled at 8 kHz).

It depends on what you mean by "ultrasonically". You can quite easily send and receive a 19 kHz tone using ordinary 44.1 kHz equipment, and virtually nobody is going to hear it.
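
The arithmetic behind that is just the Nyquist limit: a sampled system can represent tones below half its sample rate. A small Python sketch (function names are hypothetical, and it ignores the anti-aliasing filter a real sound card applies before sampling):

```python
def nyquist_limit(sample_rate):
    """Highest frequency a given sample rate can represent."""
    return sample_rate / 2


def aliased_frequency(freq, sample_rate):
    """Frequency a pure tone appears at after ideal, unfiltered sampling."""
    folded = freq % sample_rate
    if folded > sample_rate / 2:
        folded = sample_rate - folded   # fold back below Nyquist
    return folded
```

At 44.1 kHz the limit is 22.05 kHz, so a 19 kHz tone passes through unchanged, while a 25 kHz tone would fold down to 19.1 kHz if the filter did not remove it first. At 96 kHz the same 25 kHz tone fits comfortably.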

Young folks and dogs would probably be annoyed. I could hear up to about 19k into my 20s. So, depending on your use case, it might not be workable. I also used to get headaches and my ears would feel fatigued from a CRT monitor that squealed at a frequency I couldn't "hear" (but could detect by other means), so I suspect those frequencies do interact with our ears and brains in ways that we can't "hear" in the usual sense. (Though I also know that there's a lot of woo around this issue in audiophile circles...I think anything beyond about 22k can be considered safely out of range of human hearing. Our ears just don't have receivers for those frequencies.)

I'm trying to think of use cases for ultrasonic data transmission, and they're mostly nefarious. Traversing an air gap, for instance. Higher frequencies are difficult to transmit over distances and through walls, so you'd need to get your receiver into the same room or make it a very high SPL.

I have a wifi camera that uses bleeps and whistles and warbles from a phone or tablet to configure itself before it gets on the network, which is pretty neat but seems weird in a modern world. But, I guess that's something that could be done ultrasonically to make it seem more like magic. But, since it is already pretty error-prone, I would bet pushing tablets and phones (including crappy ones) to accurately produce very high frequencies at a volume sufficient to program a nearby device wouldn't be worth the added magicalness.

Lots of devices which sample at 96kHz still have a 20kHz roll off filter :-(

I read that some recording studios were filtering to ~40 kHz, on the premise that the ultrasonics would generate audible intermods in the ear. Maybe it’s audiophile nonsense.

In any case, the 96 kHz sampling would yield an additional 3 dB dynamic range vs the 44 kHz sample rate.
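
That 3 dB figure can be checked with the usual oversampling rule of thumb. This sketch assumes quantization noise is spread evenly up to the Nyquist limit and that everything above the audio band is filtered out afterwards; the function name is hypothetical.

```python
import math


def oversampling_gain_db(base_rate, new_rate):
    """In-band SNR gain from spreading quantization noise over more bandwidth."""
    return 10 * math.log10(new_rate / base_rate)
```

Going from 44.1 kHz to 96 kHz works out to about 3.4 dB, and an exact doubling of the sample rate gives about 3.0 dB, roughly half a bit of extra resolution.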

I happen to have a small recording studio. I just looked up specs: my speakers will do 40 kHz, but my mics cut out past 20 kHz. I'd imagine even if I had a mic that could pick it up, audio drivers might also cut it out.

Could the reflections be used to your advantage, like with MIMO on wifi?

Around the turn of the century when PSK31 data modulation was new to ham radio operators, it was a very standard demo at club meetings or whatever to fire up two laptops using mic/speakers instead of radios and talk a couple feet which was considered impressive because most other modulation methods are wider band and will not work well.

The relevance to your question is that hams being hams, they gotta experiment. Turn-of-the-century laptop hardware often had codecs that operated up to 96 kHz or whatever, but speakers usually cut off somewhere around the upper limit of human hearing. So operating at 10 kHz worked and might be silent to old people with partial hearing loss, but you could forget about 25 kHz, even if the signal looked great on an oscilloscope, because the speakers and mics were not broadband enough. Very few customers care about audio quality in frequency bands they can't hear.

The same hardware hooked up to a simple I/Q software defined radio works fine up to the limits imposed by Nyquist and the claimed hardware sample rate, so it's obviously limited by the mic/speaker, not the A/D converters.

This topic comes up repeatedly on HN. If HN had a wiki this meme would be a strange attractor for the HN community.

I could see "there's a message being transferred, and it's just outside of our ability to sense it" being catnip to the kid in all of us. Like invisible ink handwritten notes for adults. I looked up specs, my mic doesn't go beyond 20 kHz. Do clubs like this still exist? I get the sense that I wasn't alive when they were most popular.

Probably not, but what you could try to do is to transfer data in such a way that people would ignore the sounds because they thought the sounds were not data being transmitted.

Make it sound like birdsong in the presence of other birds, make it sound like fragments of human speech when there are people present, make it sound like fan noise in a DC and so on.

That is a sexy idea. I feel bad that my second thought was "but likely has no commercial application". Now I wonder when I got old... I'll be writing an audio engine in the next few months; I'll try hacking together something like this then :D.

Very cool :) Please let me know if you get it working.

And then you record outside and find out that the birds are gossiping about us humans all along!

Sounds like you're discussing the hypotheticals of https://en.wikipedia.org/wiki/BadBIOS

If you want to try ultrasonic data transfer from your browser, check out https://quiet.github.io/quiet-js

quietnet does that (near-ultrasonic) (https://github.com/Katee/quietnet)

This is very awesome... I was looking for something to decode SAME codes a while back.

Do any melodic/rhythmic transmission protocols exist?

If you count 10MHz and up as rhythmic then yes! But kidding aside it would be really nice to have a musically encoded version of TCP/IP that sounds nice to the ear.

You could start by picking a major key and then dividing up all bytes to be transmitted into groups of three bits (have fun with the shifting...). That way all 8 tones get to be played, and since it will be monophonic it should not sound too bad. Major points if you can do this with chords that sound harmonious.
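
A minimal Python sketch of that three-bits-per-note idea, assuming C major from C4 to C5 as the eight tones and zero-padding the last group. The function names and the choice of MIDI note numbers are illustrative assumptions, not an existing protocol.

```python
C_MAJOR_MIDI = [60, 62, 64, 65, 67, 69, 71, 72]  # C4 D4 E4 F4 G4 A4 B4 C5


def bytes_to_notes(data):
    """Split bytes into 3-bit groups and map each group to a scale note."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 3)      # pad the final group with zeros
    return [C_MAJOR_MIDI[int(bits[i:i + 3], 2)] for i in range(0, len(bits), 3)]


def notes_to_bytes(notes, n_bytes):
    """Invert bytes_to_notes; n_bytes tells the decoder where padding starts."""
    bits = "".join(f"{C_MAJOR_MIDI.index(note):03b}" for note in notes)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


def midi_to_hz(note):
    """Equal-temperament pitch with A4 (MIDI 69) at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)
```

For example, `bytes_to_notes(b"Hi")` gives `[64, 64, 60, 71, 67, 67]`, i.e. E E C B G G. Whether random data sounds pleasant this way is another matter, since nothing here controls rhythm or phrasing.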

Combining Zipf distribution, Levenshtein distance, and chords/scales -- might lead to something accommodating with aesthetic qualities.

Olivia sounds kind of nice. See the audio samples on Wikipedia.


Beautiful. That sounds like alien birdsong.

This is so awesome. I can’t wait for someone to add MNP5 (should be feasible).

If you want to play with it on OS X you can `brew install minimodem`.

Is this the first step toward restoring fax functionality to OSX?

I wanted to add a faxmodem to my Mac a couple of years ago and everything I read on the internet said that telephone modems of any kind are no longer supported on OSX.

I wonder if this could be used to write C64 tapes?

There's already a tool for that: http://wav-prg.sourceforge.net/ - also works with the ancient mini-jack-to-cassette adapters.
