Hacker News
Quiet.js – Transmit and receive data in the browser at 44.1kHz (quiet.github.io)
315 points by jonbaer 17 days ago | 91 comments

This seems like a good way of doing data exfiltration from a secured/monitored computer. Just get the source from git and run it locally in the browser, then take whatever file you want to extract, zip it, and then encode it in base64, using this to copy it to your personal phone. I bet the Goldman Sachs programmer who went to prison for uploading his trading code to an FTP server wishes he had been more clever like this.

Personal electronics are off limits in environments where exfiltration is a concern. You put your personal devices into a metal cell at the entrance, and walk through a metal scanner, airport-style.

I've experienced this in a couple of factories that build consumer-grade devices; I can imagine this to be even stricter in places where stakes are higher.

Maybe for military bases and such, but I was thinking more in the world of finance, where computers are pretty locked down to prevent people from taking valuable code for predicting stock prices or high frequency trading. In those cases, people most certainly have their phones with them, but the workstations have USB disabled, the network is closely monitored (with nearly all file sharing sites blocked completely), and anything suspicious will show up in logs.

Friends of mine who work in high-security finance have told me about some fairly exotic things. Desoldering USB ports. Epoxying cable connectors into their sockets (so that a cable can't easily be removed without a significant chance of cable or device damage, which would lead to an investigation).

I can only imagine the stuff that goes on at places like the NSA and CIA where the stakes are extremely high.

Isn't it easier to store all the computers in a secure room and then just have cable runs to the peripherals?

This gets rid of most physical tampering attempts. The only one left is that the employee could cut the cord of the mouse and jerry-rig a small USB port. You could get around this by forcing the use of PS/2 connectors.

And then people would exfiltrate data by toggling the numlock light programmatically.

Imagine being the guy who has to fix those computers when they break and finding only solutions like "just plug in this USB stick and boot from it to run this fix utility..." or the ever-present assumption that the machine is connected to the Internet 24/7.

You don't fix them.

You bring in an entirely new machine, and the old one is securely destroyed.

Oh yeah, like getting a new machine through purchasing and certified for the SCIF is a walk in the park...

You have a stock of prepared, vetted replacement machines.

You can replenish it slowly after you have just handed over a replacement.

Maybe they should be, but they definitely aren't in most places. Maybe some hedge funds ban phones but most banks don't. And having worked in several SCIFs for TS material, phones were banned but there never were scanners.

I was going to post about how there was a whole thing a few years back about how people suspected there was some malware using audio frequencies out of the hearing range to try to circumvent airgapped systems.[1]

Then I did a Google search and found that it's much more common now, with academic papers, actually developed malware and security software, and Black Hat talks on it.

So, yes, audio is used in data exfiltration, or at a minimum it's a known threat vector.

1: https://en.wikipedia.org/wiki/BadBIOS

Data loss prevention programs should be able to see you encrypting and encoding the file. Symantec’s website states it can detect someone encrypting a zip file or using PGP.

It’s a cool thought experiment. Just do it on your coworker’s computer :-)

But there are a lot of easier ways to exfiltrate, I'd imagine. Encrypting data and using the web to transfer it, for example. This would give you obfuscation, deniability, and simple data transfer.

Graphics cards are also massively high bandwidth devices... You'd need to figure out how to encode a digital signal and hook up to the wire. Or just record video off the screen.

export data as a video of a shifting qr code

sounds like a cool project

I actually wrote an implementation in Python of this exact idea:


There are all sorts of things that seem clever, until you think about the fact that, although nobody is paying attention now, once the stuff hits the fan, there will be all sorts of records that you were reading this article and posting in this thread and so on. And on the destination device, etc.

You could just use your phone to listen to the cpu to extract whatever you want.


Just do a fast/video QR code application. You can upload through the screen, one QR code at a time.

Their desktop session is video recorded and OCR is used for DLP text matching.

When I was 12 or 13 I wrote a program that transmitted data over speaker and microphone. It would generate a 90ms tone for each byte and a 10ms silence.

It wasn't elegant, and most of the code was stolen from Planet Source Code, but it worked mostly.

What did it sound like?

For the most part like a less dynamic modem. Less EEEE-OOOO-SCRSHHHH and more EEE-EEE-EEE. It had a base tone of something like middle C (can't recall, it was 20 years ago now) and just added the byte value, with a modifier, on top of the base. So for long stretches it was, to my ear, pretty much the same sound, especially when transmitting ASCII text.
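A minimal Python sketch of a one-tone-per-byte scheme like the one described above. Only the 90 ms tone and 10 ms gap come from the description; the sample rate, the base tone of 262 Hz (roughly middle C), and the 20 Hz step per byte value are assumptions, since the original values weren't recalled. The decoder counts zero crossings, which is a simple stand-in and not necessarily how the original program worked.

```python
import math

SAMPLE_RATE = 44100   # assumption: CD-style sample rate
TONE_S = 0.090        # 90 ms tone per byte (from the description)
GAP_S = 0.010         # 10 ms silence after each byte (from the description)
BASE_HZ = 262.0       # assumption: roughly middle C
STEP_HZ = 20.0        # assumption: hypothetical frequency step per byte value

TONE_N = round(SAMPLE_RATE * TONE_S)   # samples per tone
GAP_N = round(SAMPLE_RATE * GAP_S)     # samples per gap


def byte_to_freq(value):
    """Map a byte value (0-255) to a tone frequency in Hz."""
    return BASE_HZ + STEP_HZ * value


def encode(data):
    """Return a list of float samples: one sine tone plus a gap per byte."""
    samples = []
    for value in data:
        freq = byte_to_freq(value)
        samples.extend(
            math.sin(2 * math.pi * freq * n / SAMPLE_RATE) for n in range(TONE_N)
        )
        samples.extend([0.0] * GAP_N)
    return samples


def decode(samples):
    """Recover bytes by estimating each tone's frequency from zero crossings."""
    frame = TONE_N + GAP_N
    out = bytearray()
    for start in range(0, len(samples) - frame + 1, frame):
        tone = samples[start:start + TONE_N]
        # A sine at f Hz changes sign about 2*f times per second.
        crossings = sum(1 for a, b in zip(tone, tone[1:]) if (a < 0) != (b < 0))
        freq = crossings / (2 * TONE_S)
        value = round((freq - BASE_HZ) / STEP_HZ)
        out.append(max(0, min(255, value)))
    return bytes(out)
```

On a clean, noise-free signal `decode(encode(b"hello"))` round-trips. Over a real speaker and microphone you would also need to find where each tone starts and tolerate noise, which this sketch ignores. At 100 ms per byte the rate is 10 bytes per second.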

On a related note, even if the user has disabled speakers or muted the website, you can still make it generate noises: https://thume.ca/screentunes/

This is soo cool. Thanks for sharing. I have a pretty large monitor and it is easily audible when full screen.

Is that page supposed to produce a speaker noise? It doesn't produce anything audible here (standard Firefox on Windows 10)...

> On some LCD monitors this page will cause the screen to emit a tone that varies in pitch with the bar height. Maximize window for best results.

Looks like any noise is the by-product of the electronics within the device, rather than anything playing through the speakers. Coil whine (or something to that effect), I imagine.

Yes, it works on my monitor.

I wasn't aware that different screens necessarily produced the same sounds, but I noticed many years ago with CRTs that there was a faint sound that changed depending in a general way on what was displayed.

The noise should be generated by your LCD screen and is usually faint but extant. The larger the area it occupies the louder it is, so it's recommended to maximize it. You might also have an LCD screen that doesn't emit sound, but I don't know the details.

Missed opportunity to call this "LCD Soundsystem"

I'm working on a project for my local fire station to receive GPS data via VHF radio (widely used by emergency services in my country). We are trying it with DTMF; this seems interesting.
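For anyone curious what DTMF signalling looks like in code, here is a rough Python sketch that generates a DTMF key tone and detects it again with the Goertzel algorithm. The row and column frequencies and the keypad layout are the standard DTMF ones; the 8 kHz sample rate and 50 ms tone length are assumptions, and nothing here reflects how the project above actually works. A real radio link would also need gaps between tones, level thresholds, and noise handling.

```python
import math

SAMPLE_RATE = 8000                        # assumption: telephone-style sample rate
ROWS = (697, 770, 852, 941)               # standard DTMF low-group frequencies (Hz)
COLS = (1209, 1336, 1477, 1633)           # standard DTMF high-group frequencies (Hz)
KEYS = ("123A", "456B", "789C", "*0#D")   # standard keypad layout

# Map each key to its (row frequency, column frequency) pair.
KEY_FREQS = {
    key: (ROWS[r], COLS[c])
    for r, row in enumerate(KEYS)
    for c, key in enumerate(row)
}


def dtmf_tone(key, duration=0.05):
    """Return samples for one key: the sum of its row and column sine waves."""
    lo, hi = KEY_FREQS[key]
    n_samples = int(SAMPLE_RATE * duration)
    return [
        0.5 * (math.sin(2 * math.pi * lo * n / SAMPLE_RATE)
               + math.sin(2 * math.pi * hi * n / SAMPLE_RATE))
        for n in range(n_samples)
    ]


def goertzel_power(samples, freq):
    """Signal power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples):
    """Pick the strongest row and column frequency and return the matching key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i]))
    c = max(range(4), key=lambda i: goertzel_power(samples, COLS[i]))
    return KEYS[r][c]
```

With 16 symbols each tone carries 4 bits, so at a few tones per second this is slow, which fits the replies below pointing at purpose-built digital modes.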

There is a lot of prior art here, the most relevant of which is probably APRS: https://en.wikipedia.org/wiki/Automatic_Packet_Reporting_Sys...

Newer digital modes do much better than this older AFSK system, but FSK does work pretty well. Some newer things to look into:


FT8: https://en.wikipedia.org/wiki/WSJT_(amateur_radio_software)

The more interesting modes (WSPR, FT8) are designed for high-loss channels, so have a lot of error correction. For example, I can use 5W of power, indoors, to communicate across the country with it. For high-bandwidth VHF channels, it doesn't matter what you use really. The goal would be to find something that already exists and is easy to integrate, and there are a lot of options.

Don't they already do this? I hear them all the time on my scanner? Hell they use a four digit unencrypted code to activate the emergency sirens.

I thought that fire trucks had a switch to activate the sirens by the drivers; I had no idea they were activated remotely by radio.

I was referring to the standalone towers in the Midwest. Maybe elsewhere. When I was a kid I would wake up on Saturdays at noon due to the siren. My dad also had a police scanner I could hear. One summer I put 1 and 1 together.

All you need is a DTMF decoder and a small transmitter to set off those large sirens on high poles that are sprinkled throughout the country in small towns and probably large cities.

This worked to transmit "hello world" between my desktop (firefox) and mobile (firefox). Great project :) Of course you need a relatively silent room. This could have many nice applications

Share your results if you tested this! :)

(I have no relation to the library)

I like the idea but I think it would be more useful if it were more reliable. Ignore all types of speed requirements and make it super reliable; then it could actually be useful in special circumstances.

Receiving the audible sound did not work at all in any browser.

Receiving the ultrasonic (which is not very ultrasonic at all; it's clearly audible) did work in Chrome and Firefox.

Sometimes it figured out that there had been a message but not what it was. (saying missing packets)

Sending worked from all browsers.

When I lowered my mba volume to just two bars it still worked but I could no longer hear anything

Audible text worked right away from iPhone to my webcam. Ultrasound had some packet loss; it had to be really close to the cam.

That's interesting. I experienced the opposite with my iPhone and AT2020 microphone. Ultrasonic always worked perfectly, audible complained of packet loss.

I am not sure what the noise floor looks like when comparing ultrasound to the normal audible range. It does seem unlikely to me that all this street noise would only be in the audible range and not ultrasonic, but since my ears can't hear ultrasound, I have no idea. Ultrasound did work better.

I tried the image as well; no idea if it worked. I got tired of waiting.

Probably has to do with the webcam; mine was a Microsoft HD cam.

Tested sending text via ultrasonic from Chrome on a Mac to Chrome on an Android phone and it works as advertised (didn't test image transmission).

Could this work as validation of geopositioning? Instead of having to scan a specifically placed QR code (client side), a web app that listens for some encoded ultrasonic passphrase could provide the same feature.

Edit : no safari support :(

I believe this is already used in retail stores, and passively listening app libraries are able to determine the user’s location.

I believe it is also used for television ads.

Do you know of any commercially available and reliable implementations of this sort of ultrasonic beacon that a startup could use?

I have a dream of a product I want to develop someday. I like to go to bars and do karaoke. But I would love to be able to open an app on my phone and submit the songs that I want to sing, then get an ETA for my next turn. To prevent abuse, the system should somehow validate that the user is in the venue. Having the user point their phone at a screen displaying a QR code would be problematic though, especially for people like me (I'm legally blind). Would an ultrasonic beacon work reliably, even in a noisy karaoke bar? Maybe a Bluetooth beacon or beacons would be better, but that would be more hardware for the venue or mobile karaoke DJ to buy and set up. Yes, I know, all of this is totally a first-world problem; I should focus on things that really matter.

I think this is a good idea. I agree with you about the QR code being problematic, simply because some bars are extremely crowded, and it would be difficult to ‘find’ the QR code.

Also, anybody could take a picture of the QR code and re-share it.

QR code gets a tad more secure if you couple with geolocation prompt.


So, a modem?

Yes, but in a browser ! :p

It reminds me of the quote: "Everything that can be implemented in javascript will be implemented in javascript." And this implementation is actually pretty neat.

A modem _implemented in a browser_ for any device with a microphone and speaker (and browser), yes.

I'm really enjoying trying to think of good use cases for this, tbh

A PWA that picks up beacons that emit local-area "off-internet" websites wherever you go?

I was thinking along similar lines. A neighborhood sneaker-net backed by something like gun https://github.com/amark/gun

THIS WOULD BE SO COOL!!!!!! please email me :D :D :D

Reminds me of Clinkle: https://en.wikipedia.org/wiki/Clinkle

It works, but I can even hear the ultrasonic variant. I guess it's more like vibrations from the surrounding case than the speaker sound itself.

I suspect that's more due to aliasing or some other DSP that's happening (most audio designs have bandpass filters, because anything outside the usual range is not considered a valid signal.)

While this is true, 19k is also hearable to most young people (maybe under 25 or so). We use 44k because it is right above double the cutoff of human hearing.

Using frequencies above half the sampling rate will cause audible artifacts from aliasing. I guess your sampling rate is 44k...

Only exotic, multi-thousand dollar speakers will reproduce 44k, and for that you need a 96k sampling rate. By using a sample rate of 44k you can theoretically only reproduce up to 22k.

You may have misunderstood my comment. 19k/20k is right at the limit of human perception for people with extremely good ears. Thus, "we use 44k because it is right above double the cutoff," ie 22k.

As an aside, you will see 96k used (or more commonly 48k), but mostly for production work where the extra information could have value (comparable to using RAW for images, even though humans can't perceive all the information "natively").
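To make the Nyquist numbers in this subthread concrete, here is a small Python sketch. It assumes an ideal sampler with no anti-aliasing filter, and the function names are made up for illustration. It shows the highest frequency a given sample rate can represent and where a tone above that limit would fold back to.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency representable at this sample rate (half the rate)."""
    return sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency actually observed after sampling a pure tone.

    Tones at or below the Nyquist limit come back unchanged; anything
    above it folds back into the 0..limit range.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_limit(44100))            # 22050.0
print(alias_frequency(19000, 44100))   # 19000 (below the limit, unchanged)
print(alias_frequency(25000, 44100))   # 19100 (folds back into the audible band)
```

So a 19 kHz carrier fits under the 22.05 kHz limit of 44.1 kHz audio, while anything generated above that limit would show up at a lower, possibly audible, frequency.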

Probably because while the signal is carried at 19 kHz, its bandwidth is wider. Assuming it's 6 kHz wide would make it 16-22 kHz. You can't send data that is just 19 kHz because it's just a static sine wave.

The ultrasonic variant is clearly not ultrasonic. Even an 80-year-old will be able to hear that.

I used to have a 300 baud modem, which was used to connect computers over the phone (BBSs). It was fun to listen to and when I upped it to 1200 baud, it was fun to hear the difference from white noise to the more distinct beeps in the 300 baud.

I think it's neat to experience data in different forms, audible, tangible, and maybe aromatic (smell-o-vision) data?

I wonder what the bandwidth limit would be if you took 44.1kHz audio transmission to the extreme. Something like 56k6 modems?

Probably quite a bit higher: POTS was band-limited to something like 300-3000 Hz, which covers speech pretty well but not much else (hence the tinny hold music).
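A rough way to put numbers on this is the Shannon capacity formula, C = B * log2(1 + SNR). The Python sketch below uses the 300-3000 Hz band mentioned above; the 35 dB signal-to-noise ratio and the 20 kHz "full audio" bandwidth are assumptions picked only for illustration, not measurements of any real channel.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Upper bound on error-free bit rate for a noisy, band-limited channel."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


ASSUMED_SNR_DB = 35            # assumption: illustrative signal-to-noise ratio

pots_band_hz = 3000 - 300      # the 300-3000 Hz band from the comment
audio_band_hz = 20000          # assumption: roughly the whole audible band

print(round(shannon_capacity_bps(pots_band_hz, ASSUMED_SNR_DB)))
print(round(shannon_capacity_bps(audio_band_hz, ASSUMED_SNR_DB)))
```

At the same signal-to-noise ratio, capacity scales linearly with bandwidth, so a channel about seven times wider has about seven times the ceiling. Over the air between a speaker and a microphone the real signal-to-noise ratio is likely much worse than on a phone line, which pulls the number back down.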

If this is true, then a brief 30s audio signal would yield more than an A4 photo capture of a high-density visual encoding, like Microsoft HCCB barcodes. Interesting.

Raw audio files are pretty massive.

A .wav file is 16 bits/sample x 44,100 samples/sec ≈ 700 kbps. Thirty seconds of that is about 21 Mb, or roughly 2.6 MB, which is a reasonable size for a photo. However, you wouldn't be able to send nearly that much data, since you'll be limited by various kinds of noise.
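The arithmetic is easy to check in Python. This assumes a single (mono) channel of uncompressed 16-bit PCM at 44.1 kHz and ignores the small WAV header; the function names are just for illustration.

```python
def pcm_bits_per_second(sample_rate_hz=44100, bits_per_sample=16, channels=1):
    """Raw bit rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_size_bytes(seconds, **kwargs):
    """Size in bytes of `seconds` of uncompressed PCM audio (header ignored)."""
    return pcm_bits_per_second(**kwargs) * seconds // 8


print(pcm_bits_per_second())   # 705600 bits/s, i.e. about 700 kbps
print(pcm_size_bytes(30))      # 2646000 bytes, i.e. about 2.6 MB
```

That is the size of the audio itself, not the data it can carry: the usable payload through a speaker and microphone is far smaller once noise and error correction are accounted for.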

But wouldn't POTS imply a far less noisy environment once filtered?

It's a tradeoff.

You're now Nyquist-limited to sending fewer symbols per second (baud), but you might be able to use a larger "vocabulary" of symbols.

This was one of the big changes in modem design. The first 300 baud modems used 1 bit/symbol, but the V.34 modems used a bigger symbol set that could send 6-10 bits/symbol (and also sent the symbols faster).
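The relationship between symbol rate and bit rate can be sketched in a few lines of Python. The 300 baud, 1 bit/symbol case is from the comment above; the 3,200 symbols/s and 9 bits/symbol figures are illustrative values inside the range the comment gives, not numbers quoted from the V.34 specification.

```python
import math


def bits_per_symbol(symbol_set_size):
    """Bits carried by one symbol drawn from a set of this many symbols."""
    return math.log2(symbol_set_size)


def bit_rate(symbols_per_second, bits_per_sym):
    """Bit rate = symbol rate (baud) x bits per symbol."""
    return symbols_per_second * bits_per_sym


# Early modem: 300 symbols/s, two possible symbols (1 bit each).
print(bit_rate(300, bits_per_symbol(2)))   # 300.0

# Illustrative later modem: faster symbols and a much larger symbol set.
print(bit_rate(3200, 9))                   # 28800
```

Each extra bit per symbol doubles the number of distinct symbols the receiver has to tell apart, which is why a bigger "vocabulary" only works when the channel is quiet enough.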

This is a good example of attacking a digital machine without the user knowing.

A keylogger in ultrasonic range.

Silent and effective.

This technique works at OS level and browser level.

And nobody can protect against this kind of attack, except by staying away from digital machines.

Just an example of why digital growth is bad unless and until humanity lives in harmony.

This cool hack deserves the name AirDrop much more than the AirDrop feature created by Apple. No Bluetooth or other radio-transmitted standard used, and no infrared used either. Hey, I see lots of supercool social dating apps using this hack.

Smart display devices like Chromecast, and probably AirPlay, use this.

I remember Google had an extension that did something similar like 5-10 years ago... not sure what happened to that (obviously probably Killed By Google).

Otherwise, this is definitely cool! Could be used like AirDrop.

I believe that extension would be https://chrome.google.com/webstore/detail/google-tone/nnckeh... (last updated 3 years ago). Not sure if that was ever something official or just an experiment.

There's also (the non-Google) chirp.io that started around the same time.

Nice! What about implementing TCP/IP on top of that and running decentralized protocols like IPFS? Lots of use cases. But the bandwidth is probably too limited.

Has anyone figured out the throughput of this yet?

At max 44 kilobits per second, or 5.5 KB/second.

The signal isn't 44kbps because it isn't 44kHz. No speaker can reproduce 44kHz.

Could do with a dog frightener sample. Some people in our wework are allergic to dogs, so this would help them.

Would this work even with crappy laptop speakers or would you need some proper hardware?

This transmission method should work with most speakers; chirp.io has tested it with greeting card speakers.


That's because "greeting card speakers" are piezos, and have a much higher frequency response curve than even normal tweeters. On the other hand, they also have no bass.

What are some real applications for this?

Google uses this kind of authentication for Chromecast when a user isn't connected to the same network as the Chromecast. This way you can still send videos. It's called Guest Mode or something and is opt-in.

Zoom does some kind of ultrasonic sound in Zoom Room AV systems to automagically pair your computer to the room. Unfortunately I can hear it (though none of my coworkers can) as a periodic high-pitched chirping.

Streaming high quality audio files

I loved it, very cool
