Hacker News
Oboe: A C++ library for low latency audio in Android (googleblog.com)
160 points by el_duderino 34 days ago | 74 comments



> Oboe provides the lowest possible audio latency across the widest range of Android devices

The "lowest possible" on Android might not be low enough for musical instruments (<10-15ms). It's telling that they don't publish latency figures for any devices at all. (If I sound cynical, this isn't the first low-latency offering for Android.) It's just "low" for Android.

Fun fact: consoles also have terrible audio latency. Rhythm games like Rock Band/Guitar Hero don't actually need this; they only need to relate input to the audio that's playing (plus a config step to compensate for input latency). The latency is shocking if you use the mode where you free-"play" the instrument, which is why actual musicians often have a tough time with these games. The surprising thing is that delayed sound effects from games in general aren't so jarring.

FWIW iphone audio latency also isn't good enough, despite being far better than android. That's how bad android is.

EDIT 7ms claimed in screenshot at https://github.com/google/oboe/blob/master/samples/hello-obo... - doesn't include input latency: average latency of the audio stream between data entering the stream and it being presented to the audio device

EDIT they know they need latency figures "Publish a list of Android device latency" https://github.com/google/oboe/issues/235


Audio latency might not even be the main issue here: touchscreen latency is far worse, which makes touch input almost useless for someone trying to actually play notes. Note that changing parameters of filters, synths, etc. is fine, because changing these does not require the same precision as playing.

Btw if you're looking for a subjective number for tolerable latency I'd say ~5ms. This could be output latency if you're playing notes via midi or round-trip latency if you're processing incoming sound and output it again. In my experience you don't get that acceptable latency on notebooks with the onboard sound cards either :)


5ms is a bit on the low side. If you're playing through a guitar amplifier that's 4 meters from you, you get about 12 ms of latency just from sound propagation (and most amps introduce additional latency). As an extreme example of what the brain can compensate for, church organ players have to deal with 100ms+ latency and they still manage to play Bach.


You can compensate for latency with a lot of practice, but playing anything with a sharp attack is not joyful when it's over 10-20ms -- especially so if you are used to playing an instrument which doesn't have much of it (i.e. pretty much anything but a church organ :) )


And guitar (and piano) are classified as percussive instruments, for though they have strings, the strings are struck.


> church organ players have to deal with 100ms+ latency and they still manage to play Bach.

but they don't play Bach on pipe organs with a drummer


No, they play it with a choir.


Though perhaps they should.


Can confirm— I'm a piano player who did organ a few times for funerals and the like without having had formal training. If you try to match what you're hearing like on a piano, you'll end up just playing slower and slower; you basically have to play to a click track, either an actual one or the mental equivalent.


When using MIDI, its maximum bandwidth is going to be a huge bottleneck in achieving that kind of latency. At 31250 baud with eight effective data bits per 10 bit packet you're going to get an additional 0.32 ms of latency per byte.

So let's say that you are playing a simple three-note chord. Optimally, MIDI encodes this as a single status byte followed by three note+velocity pairs of data bytes. Just that is going to make something that should be instantaneous take 2.24 ms. Now add some CC messages, maybe a couple more channels, sync bytes etc. concentrated around rhythmic subdivisions and it's easily going to add up to 5-10 ms latency. And not just fixed latency, but jitter since the messages of course come in at different times.

Then, if you use a computer running Windows/Linux/OSX, you should probably account for MIDI driver latency and jitter as well. Maybe you are running a couple of instruments chained, at which point you'll also have to account for the latency between the input and the through output.

I easily hit the threshold where it's not only perceptible for the player but for a casual listener as well, but it's hard to say that it isn't "tolerable" when so much music has successfully been played and recorded with these limitations. Seems weird though that the industry has settled on this 80s protocol built for 80s machines. There's OSC (which is quite underspecified for the problem MIDI intends to address), some individual efforts by manufacturers (e.g. Turbo MIDI by Elektron) and recently talks of a "MIDI HD" protocol. I hope something happens soon that all manufacturers can hop on board.


This is correct, but multiport interfaces don't add up those latencies: although a single MIDI data line can drive 16 channels and their corresponding instruments, it's very rare that all of them are chained or paralleled via MIDI thru boxes over one data line (i.e. a single MIDI OUT). Data moves a lot faster inside the PC and the DAW, so if you have a multiport MIDI interface, connecting your expanders to different outputs - although not necessary if they use 16 channels or less - will indeed reduce the total latency.

Totally agree about the limitations of MIDI, but it was conceived when, I think, the only alternative was the Roland DCB, which was even more limited and would cost a fortune in connectors. Today I would use "something" over Ethernet phy: open, very cheap, very fast and near realtime.


> This is correct but multiport interfaces don't add up those latencies:

Well, it still adds up as far as you use the individual ports. If I have a single keyboard controller with modulation and pitch bend wheels connected to one port, I don't think it's far-fetched to say that the latency of the MIDI messages for some instant action can add up to >5 ms on its own. Might be hard to notice, but that's my point! At that latency you might as well be discussing how close you should sit to the speakers. If you're 1.72 m from your speakers, that adds +5 ms of latency. I'll reach that distance if I stand up playing with a stage monitor at my toes.

That said, I think that 5 ms here and there does matter. For example, 5 ms vs 10 ms audio buffers on a PC felt like night and day to me. The total time from finger to ear was probably already twice that, but it seems that shortening the audio buffer crossed some threshold where it suddenly didn't feel like I needed to adjust my playing for the latency.

I agree that Ethernet would have made sense today, and I think that some manufacturers already support something like that.


> The "lowest possible" on android might not be low enough for musical instuments (<10-15ms). It's revealing that they don't reveal latency for any devices at all. (If I sound cynical, this isn't the first low-latency offering for android.) It's just "low" for android.

That's probably because "the latency" for Android varies with actual hardware capability. It's like talking about "the latency" for Windows or Linux and forgetting that it's very much bound to the actual hardware underneath. There are Android devices that achieve your desired level of latency, but obviously they're rather high-end and new.


The cross-platform audio app library JUCE has a benchmark on mobile audio quality published for a lot of phones. Basically only the Pixel and Nexus phones are even close to Apple devices, and for most android phones, the situation is abysmal.

https://juce.com/maq


Does this mean we will finally get decent musicmaking apps on Android?

A decade after iOS, I guess it comes from the "better late than never" department, but I am happy. Apple seems to have lost its focus on the creative usage of their devices, and it has somewhat stagnated (compared to the initial explosion), plus iOS apps tend to be pricey.

Apple's decision to not include the headphone jack effectively puts the kibosh on their realtime audio / musicmaking apps. The latency on AirPods is atrocious. You press a key on a virtual keyboard, and count to three before you hear a sound.

Yes, this is probably solved by a dongle. Eh, I'd rather carry my Yamaha Reface DX around than think about that -- and the developers are going to have a FunTime(tm) explaining this to the users. (The sole reason musicians love iDevices so much is the JustWorks(tm) aspect - well, that just went out the window).

This Android API just might start a new wave. The video appeals to the synth nerd in me. The gear, the T-Shirt, the code samples doing the right thing.

I am looking forward to what this will bring us.


I originally bought an iPad just for music stuff, but a lot of the time I'm hooking it up to an external audio interface for better quality IO and midi. I think that iOS' support for USB Audio and USB Midi devices is also a large part of what helped it build an ecosystem of decent music making apps.

I agree that not having a headphone jack combined with the unacceptable latency of bluetooth makes it unusable for on-the-go music making.

As for pricing, I think that won't change on Android. Music making is a niche market, and these apps are not cheap to develop.


> I think that iOS' support for USB Audio and USB Midi devices is also a large part of what helped it build an ecosystem of decent music making apps.

I hate to go there, but I think the other factor is standardized hardware. You're guaranteed certain hardware setups for iOS devices; for Android it's anybody's guess whether a phone can support some features or not. It would be nice if all Android phones had USB C, and of course it would have been nicer if USB C had better regulation, since now all USB C interfaces are like... Ah well, a guy can dream, can't he?


Raph Levien worked for years on improving Android audio latency. I had a great chat with him long ago about emulating the DX7, APIs and so on. I thought they already managed to get numbers to reasonable levels (after adding tighter requirements to the Android certification suite, among others) and even gave a talk at I/O. I'm surprised he's not involved in this, but then this is C++; he's been really into Rust for a while and perhaps back to working on typography alone — assuming he's still at Google, on the Android team.


Wrong on both counts. However, I'm very much back in the audio synthesis game, and will give a talk at next month's SF Rust Meetup on the topic.


Looking forward to seeing your results with making rust work on audio and a cross plat gui. I'm a rust noob but willing to try and help if possible since I want to make audio apps with a sane language like rust.


Would love to hear whether you're planning to put up video and/or slides -- that way I don't have to drive up from Los Angeles. Or get around the waitlist of 30 people for a full event. :)


Yes to both, and also plan to write blogs about the content, and documentation for the library as it matures. I'm gratified to see the interest!


Looking forward to it! It has always seemed to me like such a great application of the language, having done a fair amount of low-latency A/V programming in the C[++] realm.



I'm not sure how they are doing it, but if you enable "Live Listen" mode with AirPods (basically turns the iphone into a microphone and the AirPods into a hearing aid) in iOS12 the latency is almost imperceptible.

https://www.engadget.com/2018/06/05/live-listen-ios-12-apple...

When using this mode to listen to my TV, I can just tell that there is a delay between the real sound and the sound from the AirPods, but the delay isn't long enough that I can't understand the speech, nor does it really bother me.

If anyone has technical info I'd love to know, as I have also witnessed that in normal applications you easily get 1-2 seconds of latency. Obviously videos synchronize with it, but it's not ideal for games etc.

I do wonder if it's a matter of some settings the apps need to set, or some such. If you think about it, calls are also lower latency than this... though they use a different BT audio profile (HFP/HSP) that is mono, instead of the A2DP stereo profile.

Last of all, I was surprised Apple didn't come up with some custom protocol mode for AirPods that allows stereo + mic. Gaming consoles do this on 2.4GHz even with multiple controllers, so it seems possible in theory (though perhaps not alongside 2.4GHz WiFi? Not sure).

Would love more insight on all the above.

EDIT: Side note to the OP, just use a lightning adapter :-)


>When using this mode to listen to my TV, I can just tell that there is a delay...

That absolutely destroys it for music apps. A decent latency for music apps is under 10ms. You don't hear it, you feel it when you press a key and you don't hear the sound immediately. It prevents you from playing fast, even if you kind of adjust to it.

Your brain is very sensitive to timing differences. Hit a drum with two sticks - you can hear the difference on beat and almost-on-beat very easily. And if you are trying to use vocal effects and hear your own voice delayed - that just messes with your brain.

Read this article for more background on how latency affects musicians and what numbers are acceptable: https://www.churchproduction.com/education/latency-and-its-a...

As for the side note - I played with my friend's iPhone, and decided to stay away from that device :) I still have an old iPod Touch which runs a looper just fine.


Google is only interested in tablets that run ChromeOS, and just announced one without a headphone jack. Not sure if the low-latency audio is making it to the CrOS/Android environment either.


I'd say it's unlikely, mostly because you'll still be limited by quality of audio drivers on actual devices. And most OEMs don't really care, making the actual addressable market for audio apps way smaller.


Having seen the Windows side of it, I am not worried too much about the drivers - even Realtek built-in soundcards can pull off decent latency.

Touch screens, on the other hand, are a problem -- they add input latency if you want to use the screen as an instrument/controller, and that's what a lot of music apps do (virtual keyboards/drum pads/etc).

Last time I really tried those, every single device was atrocious in that respect (both iDevices and Android)[1] -- keep in mind that total latency from touch to sound should be under 10 ms to be really acceptable.

And things hadn't been getting better until recently. Aside from music equipment, the trend for everything has been to get more sluggish.[2]

Razer Phone and iPhone X might change that[3] if others follow suit, so I'm hopeful (both feature 1/120sec touch latency, which is OK).

[1]https://www.imore.com/iphone-5-touchscreen-latency-measured-...

[2]https://danluu.com/input-lag/

[3]https://www.macworld.com/article/3235709/iphone-ipad/iphone-...


Have you tried Ableton's Link technology on Apple? I know a few DJs who find this solution works well as it will sync with their desktop DAW.


I haven't - it seems to be an orthogonal technology to what is being discussed here.


Typical Android NDK library from Google: instead of being added to the NDK, Android Studio and the respective documentation, we get a GitHub repository and a "good luck" pat on the back.

And maybe if they have enough buffer on their 20% we get some of the reported issues fixed.

Like cdep, Play Services for C++, FPL libraries...


The proliferation of apps for musicians on iOS (drums, guitar amps) is explained by iOS's longstanding API for low-latency audio. I hope we can expect the same kinds of apps, with usable latency, for Android now too.


I contacted one of the Oboe authors and reached the conclusion that this doesn't achieve anything you couldn't already get with the openal-soft lib. You can expect some 20ms of audio delay, plus 16-33ms of touchscreen latency.

If you are curious about latency, check out my android app "Hexpress" (on play store) that gives you drums and other instruments to play with. I measured roughly 50 ms of tap-to-sound delay, not consistent across devices. That lands it within 'cute audio toy' territory, which is why I stopped investing time.


That's too bad: on my OnePlus 6, the latencies for your app are great. It's the first time I'm drumming in an Android app, and am not bothered by latency. Thank you for having the source on Github, I'll amuse myself this weekend checking the code.


I looked at the implementation of the openStream() function:

  AudioStream *AudioStreamBuilder::build() {
      AudioStream *stream = nullptr;
      if (mAudioApi == AudioApi::AAudio && isAAudioSupported()) {
          stream = new AudioStreamAAudio(*this);  

      // If unspecified, only use AAudio if recommended.
      } else if (mAudioApi == AudioApi::Unspecified && isAAudioRecommended()) {
          stream = new AudioStreamAAudio(*this);
      } else {
          if (getDirection() == oboe::Direction::Output) {
              stream = new AudioOutputStreamOpenSLES(*this);
          } else if (getDirection() == oboe::Direction::Input) {
              stream = new AudioInputStreamOpenSLES(*this);
          }
      }
      return stream;
  }  

  Result AudioStreamBuilder::openStream(AudioStream **streamPP) {
      if (streamPP == nullptr) {
          return Result::ErrorNull;
      }
      *streamPP = nullptr;
      AudioStream *streamP = build();
      if (streamP == nullptr) {
          return Result::ErrorNull;
      }
      Result result = streamP->open(); // TODO review API
      if (result == Result::OK) {
          *streamPP = streamP;
      }
      return result;
  }
Doesn't this leak memory if streamP->open() fails? I'm also surprised they are using new instead of unique_ptr, especially since one of their listed benefits is "Convenient C++ API (uses the C++11 standard) ".


Yeah, seems like memory leak to me too. streamP is leaked if open fails.

And this in modern C++ in year 2018:

if (result != Result::OK) { goto error2; }


On the face of it, it does look like the memory pointed to by streamP leaks. Unless build() kept another pointer to be used in the destructor.


I used Oboe (and its predecessor, Howie) to make a self-balancing robot on Android. We were able to use the headphone jack (RIP) as a low-latency output to send serial data to an Arduino[0].

Android Studio has also finally improved NDK support enough that writing large chunks of your app in C++ isn't painful anymore.

[0]: https://davidawehr.com/blog/audioserial/


Unfortunately, this is unlikely to improve latency on devices which don't already have 'good latency'. Reviewing the linked doc, this is a 'high level' API whose goal is possibly more about streamlining the API across multiple recent releases, while, I'm sure, addressing latency as best it can. The problem is that this API still sits on top of other, lower (and already existing) APIs: AAudio for current releases and OpenSL ES for older ones.

A large part of Android audio latency is down to the HAL layer that each vendor provides. Android as a framework offers support for varying degrees of latency, but it's up to the vendor supplying the audio HAL to support this. This is exposed through the low_latency prop check (or 'professional' in recent releases). If the vendor HAL doesn't support 'low latency' through reduced buffer sizes and efficient (or no) locking etc., there's nothing a high level API like this can do, other than bypass the HAL and talk to ALSA directly, but that breaks the whole architecture.


It may be interesting to recall that Collabora ported the professional Linux JACK audio server to Android, and it has far less latency than AudioFlinger (btw, in French "audioflinger" phonetically means "to destroy the audio (quality)", which is kinda appropriate).

Sadly, no OEM ever cared, to my knowledge, to integrate this work in their ROMs. I also opened an issue on the Android bugtracker with a feature request to replace AudioFlinger with ALSA/PulseAudio. They never answered nor closed the issue, but I haven't checked in two years.

Edit: here is the android feature request https://issuetracker.google.com/issues/37073168 Maybe if you comment/upvote it it may gain traction.

Digression: If you are interested in audio latency, I recall that Firefox has better webaudio performance than Chrome.


Unrelated, but there's a code example on that website:

    AudioStreamBuilder builder;
    AudioStream *stream = nullptr;
    Result result = builder.openStream(&stream);
Why even bother initializing stream to nullptr if it's going to be overwritten anyways by AudioStreamBuilder::openStream?


You'll probably get a warning about using an uninitialized variable if you don't. Compiler probably can't tell the variable will only be written and not read.

It's also just good practice to initialize your variables.


I think it's good practice to avoid having uninitialized memory.

Just in case, for example, the initialization gets moved inside of an if statement and no one thinks about the else case.


Thought the same recently and omitted the initialization. Then Lint yelled at me. I decided it's not a fight worth fighting since it basically costs nothing and is more robust in the face of change.


Unfortunately this won’t help with the terrible hardware level latency on many Android devices.


It's not the hardware per se; ALSA could certainly be improved by vendors, but it's the intermediate layers that cause the problem. Android needs to enforce low latency upon vendors, and not continue to give them the easy option of avoiding it.


Audio latency issues on PCs are largely alleviated thanks to better software (device drivers that shorten the path to/from the hardware, such as ASIO4ALL) combined with better hardware (soundcards interfacing over PCI-E/FireWire/Thunderbolt/USB) at affordable prices. Plenty of audio cards today offload computation from the CPU via discrete DSPs.

On Android devices (and smartphones in general) - hardware would be the limiting factor.


Not even close. You can have 1 millisecond latencies with a well configured kernel (PREEMPT), directly talking to ALSA. I'm talking both Qualcomm and Exynos CPUs.

It just takes a SCHED_FIFO task with forced CPU affinity. Android does not make it easy to get one.

There are some other hardware issues, like audio input and output using separate clocks on Qualcomms, but that's at most one extra buffer.

Speaking of this, you can have sub-millisecond latencies on Beagleboard. These devices in phones are vastly more powerful.


> "It just takes a SCHED_FIFO task with forced CPU affinity. Android does not make it easy to get one."

Are you aware of any non-RTOS that "makes it easy to get one"?

> "Speaking of this, you can have sub-millisecond latencies on Beagleboard."

I did not make myself clear, so I'm taking the blame here.

I'm mostly interested in the use case. If it's just capturing audio alone, this number makes sense, and no fancy hardware is necessary.

The moment we introduce some kind of processing, or even logging the stream to disk, buffering becomes necessary and latency is introduced. Assume an audio stream read by a "user-mode" service, then redirected out through headphones - are we still talking sub-millisecond latencies?


Audio is an extraordinarily low-demand task, and hardware is not the issue, and hasn't been for a couple of decades (even DSP offloading has virtually disappeared). A smartphone has many orders of magnitude of excess performance for handling extremely low latency audio. The iPhone has had 7ms latency for many generations, and the 7ms is not some intrinsic limit but simply a decent balance between low power usage and ensuring no glitches.

The problem on some platforms is one of architectural choices and prioritization. On Android we know they started with the already arguably poor-latency Linux audio foundation of ALSA, then layered on and layered on (flingers and HALs and user-mode transitions), each layer adding its own ring buffers.

Even on Windows, on the fastest PC known to man, audio has generally poor latency (because it's architectural) which is why audio software makers have their own hardware->application drivers (ASIO).

Low latency audio was not important to the project, and they dug themselves in so deep that for many years we've been hearing recurring "We've finally solved that latency issue" claims.


Check out the Bela project[1], which has back-ends for a few different audio packages (e.g. PD and SuperCollider) that totally bypass the kernel and handle the audio callback directly (maybe from a bare-metal ISR, not exactly sure). That’s how they get sub-ms round-trip latencies.

[1]: http://bela.io


Completely agree. Hardware itself doesn't introduce any noticeable latency; it's just a whole lot of buffers at various software layers, because avoiding audio stuttering and reducing power consumption are always prioritized over latency. Some devices actually support low latency in their audio drivers by conforming to Android Professional Audio. I haven't yet gotten a chance to test whether the improvement is noticeable on any of those devices.


What about all the delays from the use of Java? Even if your audio app is in C++ for performance, it's still competing for resources with many other Java apps easily using gigabytes of RAM just idling. I don't see any way to mitigate that.


Not a problem at all; that is just the typical Java FUD and badly coded apps.

The real-time audio APIs are in native code, and they can request priority when in the foreground.

Samsung has supported real-time audio on their S models for many years.

https://developer.samsung.com/galaxy/professional-audio

Which they are now deprecating as they also contributed to the design of AAudio.


I can "feel" the lag in the Android experience, and it's the same lag that I get from desktop apps written in Java. It's not just FUD. Almost always, when there's a memory-bloated application, it's written in Java. Hell, adk won't even run unless you increase the -Xmx setting to 4+ GB.

People rightfully pointed out that tablets are plenty powerful to do DSP type applications, but something is consuming all the resources. I'm just saying what that something is.


Don't confuse the language with the ability to program.

I've lost count of the number of times I have fixed junior code doing what should be background stuff on the main thread, for loops instead of System.arraycopy, allocating memory in loops, and lots of other stuff due to lack of proper teaching.

As for Android, the real time audio stack is fully native and Google had to learn from Samsung how to do it properly.


Wonderful! Latency has been a massive blocker for audio on Android - this makes me want to break open the SDKs again and tinker.


So what are the latency numbers for some common devices? (apart from the 7 ms visible in the hello oboe screenshot)


'low_latency' used to mean sub-20ms; however, 'professional' was introduced a while ago to replace it, with 'low_latency' becoming 40ms.


Sorry, actually it's 45ms. From the docs: android.hardware.audio.low_latency indicates a continuous output latency of 45 ms or less; android.hardware.audio.pro indicates a continuous round-trip latency of 20 ms or less.


Can anyone recommend the best way to do low latency audio on windows and osx?


For Windows, ASIO! http://www.asio4all.org/


I think WASAPI has more or less superseded ASIO and has more support from Microsoft.

If you're programming in Rust, you can use the cpal crate, which wraps all these. I'm not sure cpal has astonishingly good performance in all cases, but I plan to work with the authors. It doesn't have an Android back-end yet, but such a thing should be possible.


Excellent. Trying to do sine wave generation, on and off at low latency.


macOS has very low latency with the standard APIs.

If you use JUCE (https://juce.com/) you'll get a single API for all the different audio APIs on macOS, Windows, iOS, Android, Linux, etc.


Would this make it possible to build active noise cancelling using software?


I wouldn't think so. If your cancelling signal is half a period out, you'll double the volume instead of cancelling it. That suggests the required latency is on the order of microseconds, not milliseconds.


Android already has an AEC (acoustic echo cancellation) module that you can use. It's timestamp-based, using the timestamp calls to ALSA to perform the cancellation.


You guys should also check out AudioKit https://github.com/AudioKit/AudioKit


I don't see an Android version mentioned there?


Because there is none; this is for Apple devices. For Android there's this SDK, though I didn't check the licensing terms:

https://superpowered.com


The title should probably mention "for Android". It's misleading without that context.


Thanks, updated!



