
Making high-fidelity audio sound like it came through the phone (2018) - jonluca
https://blog.jonlu.ca/posts/phone-calls
======
noizejoy
As an alternative to Fourier transforms, one can also use classic audio
filters: a high-pass plus a low-pass filter, or a band-pass filter (or
similar EQ settings), with the aim of boosting the signal around 2 kHz and
silencing it below about 1 kHz and above 4 kHz.

Caveat: I love doing this in a musical context for a specific part (e.g.
vocals or drums or guitar) amongst many, and so I tend to fine-tune the exact
frequencies in the context of the other parts so they evoke the memory and
emotion of a telephone line more than the exact specifications of a POTS[0]
line.

[0] POTS = plain old telephone service or plain ordinary telephone service
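
A minimal sketch of the band-pass approach described above, assuming SciPy is available. The function name `telephone_bandpass`, the filter order, and the 1 kHz / 4 kHz band edges are illustrative choices, meant to be tuned by ear rather than taken as a specification.

```python
import numpy as np
from scipy import signal


def telephone_bandpass(samples, sample_rate, low_hz=1000.0, high_hz=4000.0, order=2):
    """Band-pass `samples` with a gentle Butterworth filter.

    The band edges and order are illustrative assumptions, not POTS specs.
    """
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = signal.butter(order, [low_hz, high_hz], btype="bandpass",
                        fs=sample_rate, output="sos")
    return signal.sosfilt(sos, samples)
```

A tone near 2 kHz passes almost unchanged, while content far below 1 kHz or far above 4 kHz is progressively attenuated rather than cut off sharply.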

~~~
TheOtherHobbes
Simple audio filters will have a fairly gentle roll-off above/below their
cutoff. Zeroing FFT bins will brickwall the frequencies that are being
removed. They sound subtly different.
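
For comparison, zeroing FFT bins might look roughly like the sketch below. It assumes NumPy and processes the whole signal in one transform; the function name `fft_brickwall` is hypothetical, and the 300 Hz / 3300 Hz edges are just the POTS figures quoted elsewhere in this thread.

```python
import numpy as np


def fft_brickwall(samples, sample_rate, low_hz=300.0, high_hz=3300.0):
    """Zero every FFT bin outside [low_hz, high_hz] and transform back."""
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    # Bins outside the pass band are set to exactly zero (the "brickwall").
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))
```

Because the whole signal is transformed at once, this only works offline; a streaming version would need block processing with overlap.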

You can build brickwall emulated filters but they're never perfect, and you
have to make a tradeoff between high latency and phase distortions/ringing.

Personally I like FFT filtering because subjectively it can sound more
"vintage" than filter emulations.

It's interesting that this is being used to emulate a mechanical limitation -
the limited mechanical frequency range of carbon granule microphones and small
speakers.

Modern microphones and speakers can record and reproduce a much wider
frequency range with much smaller transducers.

~~~
kragen
Depends on what you mean by “simple”. If you want a circuit you can build with
one op-amp and some passives from some coefficients you have memorized,
without simulating it first, you're probably going to go with a second order
Butterworth or something and get that gentle roll-off you're talking about.
But if you're using SciPy or Octave, you can synthesize a tenth-order elliptic
IIR filter and apply it in three lines of code (to me that's “simple”), and
it'll be pretty brickwallish.

It's true that you get more latency as you move to lower-phase-distortion IIR
and zero-phase-distortion FIR filters. But that's not a reason to filter in
the Fourier domain instead, because in the Fourier domain you get _all the
latency_. If you can pay _all the latency_ that way, you can eliminate the IIR
phase distortion with filtfilt.
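
A sketch of what that could look like in SciPy, under some assumptions: the 0.5 dB passband ripple, 60 dB stopband attenuation, and 3300 Hz cutoff are illustrative values, and the function name is hypothetical. It uses the second-order-section variant of filtfilt (`sosfiltfilt`), which tends to be better behaved numerically for a tenth-order design.

```python
import numpy as np
from scipy import signal


def elliptic_lowpass_zero_phase(samples, sample_rate, cutoff_hz=3300.0):
    """Tenth-order elliptic low-pass applied forward and backward.

    Ripple (0.5 dB) and stopband attenuation (60 dB) are assumed values.
    """
    sos = signal.ellip(10, 0.5, 60, cutoff_hz, btype="lowpass",
                       fs=sample_rate, output="sos")
    # Forward-backward filtering cancels the phase distortion, at the cost
    # of needing the whole signal up front (i.e. all the latency).
    return signal.sosfiltfilt(sos, samples)
```

Running the filter in both directions also doubles the attenuation in decibels, so the effective stopband is deeper than the 60 dB of the single-pass design.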

Similarly, ringing in your IIR filter is not a reason to go to the Fourier
domain and start bashing bins to zero, because you get _all the ringing_ that
way: the Fourier-domain difference between your original signal and the
transformed signal is a Fourier-domain impulse, whose inverse Fourier
transform is a sinusoid that fills eternity.

Also, just to be clear, and you probably already know this, but ringing and
phase distortion are two totally different phenomena.

------
pjc50
AMR is a very "cheap" codec to run. If you want something to sound like it's
been over the GSM phone system, you could just run it through AMR and back
again. Possible patent encumbrance, but implementations are available in
ffmpeg.

------
AstralStorm
The "too good" part is due to missing compander - very similar or identical to
ADPCM compander for better analog lines. ADPCM itself in form of G.711 is
still used in older digital phone exchanges.

~~~
philpem
muLAW and aLAW are also worth considering, and give a slightly different
effect to ADPCM.

The LPC and CVSD series vocoders also have a very unique(ly lossy) sound
profile.
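
To get a feel for the mu-law effect, here is a rough sketch using the continuous mu-law curve with mu = 255. This is an approximation and not the exact segmented G.711 encoding; the function names and the `levels` parameter are hypothetical.

```python
import math

MU = 255  # value commonly used for mu-law telephony


def mulaw_compress(x, mu=MU):
    """Map a sample in [-1, 1] onto the companded scale [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y, mu=MU):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def mulaw_roundtrip(x, levels=256, mu=MU):
    """Compress, quantize uniformly (roughly 8 bits), then expand."""
    step = 2.0 / (levels - 1)
    quantized = round(mulaw_compress(x, mu) / step) * step
    quantized = max(-1.0, min(1.0, quantized))  # keep within range
    return mulaw_expand(quantized, mu)
```

Applying `mulaw_roundtrip` to every sample gives quantization noise that is small for quiet samples and larger for loud ones, which is part of the characteristic sound.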

~~~
kragen
A fun thing with LPC is the lpc10 codec built into Sox, making it trivially
easy to play with and get a very pleasing distortion.

------
grabbalacious
Coming soon, AI prediction of what untelephonic voices would have sounded
like! Starting with a detelephonised _One Night in Bangkok._

[https://www.youtube.com/watch?v=rgc_LRjlbTU](https://www.youtube.com/watch?v=rgc_LRjlbTU)

~~~
degenerate
Hmm... you may be on to something. In the recommended feed was Toto's "Africa"
remastered with AI:
[https://www.youtube.com/watch?v=AqtBdKP_FPs](https://www.youtube.com/watch?v=AqtBdKP_FPs)

The lyrics do indeed sound crisp as if they were re-recorded, which is
impossible. Anyone know if this was really done with AI, and what this field
is called (music restoration?)

The video was uploaded to some random Spanish channel and I can't find any
information about it by searching.

------
bravoetch
POTS cuts off frequencies below 300 Hz and above 3300 Hz. The simulation
should be almost perfect, but it doesn't play.

~~~
luismarques
It also happened to me. It was a Firefox compatibility issue; it played in
Chrome.

------
porjo
Only the first sound file plays for me!? Using Firefox on desktop and mobile.

~~~
scalableUnicon
From the console:

> Media resource
> [https://blog.jonlu.ca/assets/eightkhz_resampled.wav](https://blog.jonlu.ca/assets/eightkhz_resampled.wav)
> could not be decoded, error: Error Code: NS_ERROR_DOM_MEDIA_METADATA_ERR
> (0x806e0006)

You can download and play those audio files directly:
[https://blog.jonlu.ca/assets/eightkhz_resampled_unfrequencie...](https://blog.jonlu.ca/assets/eightkhz_resampled_unfrequencied.wav)
and
[https://blog.jonlu.ca/assets/eightkhz_resampled_unfrequencie...](https://blog.jonlu.ca/assets/eightkhz_resampled_unfrequencied.wav)

------
gwbas1c
The phone system has been mostly digital "forever." Pretty much by the 1990s,
even if you had an analog copper phone line, it was digital by the time it
made it to the switchboard.

------
ThePowerOfFuet
> EIGHT_KHZ = 8096

Was this deliberate (vs 8192)?

~~~
Geee
Sampling rates don't need to be powers of two. Commonly used sampling rates
are, e.g., 22050 Hz or 44100 Hz. The maximum signal frequency is half of the
sampling rate, and sampling higher frequencies causes aliasing if they're not
cut off first. So, 8096 Hz can sample up to 4048 Hz without aliasing. If their
cutoff filter is at 4000 Hz, they're on the safe side by sampling a bit
faster, because the filter isn't perfect.
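
The arithmetic above can be checked with a small sketch. The helper names are hypothetical, and `alias_frequency` assumes an ideal sampler fed a pure tone with no anti-aliasing filter in front of it.

```python
def nyquist_hz(sample_rate):
    """Highest frequency representable without aliasing."""
    return sample_rate / 2


def alias_frequency(tone_hz, sample_rate):
    """Frequency at which a pure tone shows up after sampling."""
    folded = tone_hz % sample_rate
    # Frequencies above Nyquist fold back into the range [0, sample_rate / 2].
    return min(folded, sample_rate - folded)
```

For example, at 8096 Hz a 3000 Hz tone is preserved, while a 5000 Hz tone that slips past the filter would show up at 8096 - 5000 = 3096 Hz.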

------
8bitsrule
Every now and then I hear someone call a radio station using a landline phone
-in good repair-. The difference in quality (despite the LL bandwidth
filtering) compared to mobile-phone callers is often unmistakable. For one
thing, the mobile artifactual garbling and missing audio segments stand out.

So it's helpful to stipulate what kind of 'phone audio' you're trying to
mangle into 'fidelity'.

------
dropoutcoder
Eventide’s classic DSP4000b has algorithms for phone call sonic simulation.
The patches are highly modular (and thus can be inspected for analysis - at
least down to the blocks used to construct the patch) and it would be
interesting to see how that old school hardware DSP approach compares to the
author’s design and implementation.

------
aiphex
I remember landlines sounding quite different from what the author describes. To me they
sound much better than cell phones. It sounds as if the other person is
actually on the other end of the line. Cell phones sound very artificial with
compression artifacts and such.

------
bcrl
He forgot to add in some 60 Hz (or 50 Hz) buzz. Maybe add in a bit of
crackling as well since it's now spring and wet around here...
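
A rough sketch of that idea, assuming samples are floats in [-1, 1]. The function name, the hum and crackle levels, and the crackle probability are all made-up illustrative values, and real line noise is certainly more complicated than a sine wave plus random clicks.

```python
import math
import random


def add_hum_and_crackle(samples, sample_rate, hum_hz=60.0, hum_level=0.02,
                        crackle_prob=0.0005, crackle_level=0.3, seed=0):
    """Mix mains hum and sparse random clicks into `samples`."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    out = []
    for n, x in enumerate(samples):
        hum = hum_level * math.sin(2 * math.pi * hum_hz * n / sample_rate)
        # Occasionally add a single-sample click of random amplitude.
        click = 0.0
        if rng.random() < crackle_prob:
            click = rng.uniform(-crackle_level, crackle_level)
        out.append(max(-1.0, min(1.0, x + hum + click)))  # clip to range
    return out
```

Pass `hum_hz=50.0` for regions with 50 Hz mains.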

------
lurker2823
works here, firefox mac, version 74.

