
iOS Apps Using Private APIs - tptacek
https://sourcedna.com/blog/20151018/ios-apps-using-private-apis.html
======
NateLawson
Hey all, I'm the founder of SourceDNA and happy to answer any questions about
how we found this or about binary code search in general.

We take a different approach to understanding code than the traditional
antivirus world. Rather than try to hunt for a needle in a haystack, we've
created a system for finding anomalies in code that's already published. For
example, you can build a set of signatures for "bad apps" and then repeatedly
search for them (AV model) or you can profile what makes an app "good" and
then look for clusters of apps that deviate from it (SourceDNA).

Consider an ad SDK like Youmi here. They weren't always scraping this private
data from your phone. Some apps ship an older version of this library, and
that version is a typical, only sorta intrusive, ad network.

But, over time, they began adding in these private API calls and obfuscating
them. This change sticks out when you track the history of this code and
compare it to other libraries. There was more and more usage of dlopen/dlsym
with string-prep functions beforehand. This is quite different from other
libraries, which stick to more common syscalls.

By looking for anomalies, we can be alerted to new trends, whatever the
underlying cause. Then we dig into the code to try to figure out what it
means, which is still often the hardest part. Still, being able to test our
ideas against this huge collection of indexed apps makes it much easier to
figure out what's really going on.

~~~
vitd
One concern I have is that they'll just move this one step further down the
road. For example, I believe that you can get the address of a function like
dlopen() by manually loading a bundle and looking up the function by name (via
something like CFBundleGetFunctionPointerForName()), constructing the string
"dlopen" through some obfuscated method, as they do with Objective-C symbols.
Then it becomes harder to detect that they're even using dlopen(). Any plans
on how to detect that? It seems like an arms race that can't easily be won.

~~~
NateLawson
You're exactly right. You can go even further in gathering environmental data
from syscalls and using it to construct strings at runtime. At some point, you
have to include dynamic analysis in addition to static analysis.

The iRiS paper we mentioned in the blog post describes a really great approach
to doing this. They do "forced execution" using a port of Valgrind to iOS.
They also do the exact right thing by resolving as many call targets as
possible statically, then using dynamic analysis only for the call sites that
can't be resolved. This saves on runtime and complexity, though you might
notice that even this approach didn't resolve 100% of them.

Ultimately, you're dealing with a variant of the halting problem, where the
app uses a specific value only on a full moon, on iOS 4.1, where the username
is "sjobs". And that's why computers are still fun.

PDF:
[http://www.cse.buffalo.edu/~mohaisen/classes/fall2015/cse709/docs/deng-ccs15.pdf](http://www.cse.buffalo.edu/~mohaisen/classes/fall2015/cse709/docs/deng-ccs15.pdf)

------
makecheck
I'm actually very surprised this didn't happen years ago. The power of
Objective-C's runtime has always made this pretty straightforward.

Apple can defend against unauthorized calls to even runtime-composed method
names though. I can think of a few ways.

They could move as much "private" functionality as possible outside of
Objective-C objects entirely, which requires that you know the C function name
and makes it obvious when you've linked to it. This should probably be done
for at least the really big things like obtaining a device ID or listing
processes.

Even if they stick with Objective-C, they could have an obfuscation process
internal to Apple that generates names for private methods. Their own
developers could use something stable and sane to refer to the methods but
each minor iOS update could aggressively change the names. If the methods are
regularly breaking with each release and they're much harder to find in the
first place, that may be a sufficient deterrent to other developers.

They could make it so that the methods are not even callable outside of
certain framework binaries, or they could examine the call stack to require
certain parent APIs. At least that way, if you want to call a private API, you
have to somehow trick a public API into doing it for you.

And, I think Apple does say somewhere that developers shouldn't use leading
underscores for their own APIs. They could hack NSSelectorFromString(), etc.
to refuse to return selectors that match certain Apple-reserved patterns in
all circumstances.

~~~
BooneJS
What's so magical about Obj-C's runtime that this cannot be detected by Apple?
If a third-party can scan for them, why can't Apple? If you're going to have a
walled garden, by all means please ensure the walls can't be scaled or dug
under.

~~~
K0nserv
AFAIK the combination of NSSelectorFromString and NSClassFromString would
allow you to build and make calls from obfuscated strings at runtime. Since
Objective-C uses message passing, this is non-trivial to catch. Compared to C,
where you must explicitly reference symbols that are easy to check for
automatically, detecting the use of private APIs is more difficult.

I am not sure if you have to explicitly link against certain frameworks/dylibs,
though; someone with more knowledge, feel free to correct me.

~~~
uxp
The article touches on how to dynamically link against a framework/dylib at
runtime with dlopen(3) and dlsym(3) to resolve its symbols' addresses.

I've always wondered why Apple doesn't run all apps against a "debug" build of
iOS that asserts that the caller of private APIs is itself internal/private to
Apple, but instead relies on something akin to grepping the output of
strings(1).

~~~
K0nserv
This seems like a good approach, but you can still get around it by detecting
Apple reviews and not doing anything with private APIs during them.

~~~
kenrikm
It's quite trivial to set up a method that implements "NSSelectorFromString"
[1] etc. and have it read from a JSON payload sent from a server. It's very
hard for Apple to check for that unless they have active monitoring on apps
after the review process.

[1]
[https://developer.apple.com/library/ios/documentation/General/Conceptual/DevPedia-CocoaCore/Selector.html](https://developer.apple.com/library/ios/documentation/General/Conceptual/DevPedia-CocoaCore/Selector.html)

~~~
NateLawson
Right. Any runtime behavior can be altered by observed state from outside the
phone.

There's even a paper on intentionally inserting security flaws into your code
and then exploiting them from your own server to change execution patterns:

[https://www.usenix.org/conference/usenixsecurity13/technical-sessions/presentation/wang_tielei](https://www.usenix.org/conference/usenixsecurity13/technical-sessions/presentation/wang_tielei)

Ultimately, you need to enforce access control instead of just trying to
detect problems a priori. Apple's sandbox is a great start to that, and I
expect they'll keep improving it to block apps like these.

~~~
jandrese
The state doesn't even have to be from outside the phone. You could have an
internal timer that kicks off 2 weeks after you've submitted to the App Store
(to allow for unexpected delays) and switches on the evil behavior.

------
kinofcain
Seems like a lot of the things they're putting behind private APIs should
instead/also be behind a user permission. Getting the list of installed apps,
device serial number, and the users email address shouldn't be protected
simply with obfuscation.

~~~
musesum
Yeah, the user's email address is a surprise. It's pretty easy to bypass a
simple scan with objc_msgSend.

Conversely, it would be nice if the user could grant access to email and
iMessage sandboxes. This would allow us to apply machine learning to
personalize services. Ironically, by allowing opt-in, Android is an easier
platform for creating private services.

------
BinaryIdiot
I'm not an iOS developer (well, not really; I don't know what I'm doing), but
this seems like it would be a really easy thing for Apple to detect. Does
Apple simply not care enough about access to these APIs to add better
checking, or is there something fundamental about their platform that makes
such checking seriously difficult for Apple?

~~~
RandallBrown
Objective-C works by sending a message to an object. If that object responds to
the message, it does something. The message is essentially just a string, and
in fact you can use strings to build selectors.

Normally, these private API selectors would show up in a class dump and Apple
will reject your app. But, if you're clever, you can hide them from a class
dump. You could encrypt the strings, then decrypt them at runtime, and Apple
could no longer find your private API usage in a static scan.

At runtime they _could_ detect you calling private APIs, but it would be easy
enough to code it so that you don't call any private APIs for a few days after
first launch, or so that they're turned on with a server-side flag. That way
Apple would never notice the private API usage during an App Store review.

~~~
achr2
The question is why don't these APIs require authorization in the first place?
Obfuscation is a terrible security practice.

~~~
josso
Because as soon as Apple publish something as an API, they can't change the
method because third party apps could now be depending on it. As long as the
API is private and undocumented, Apple can change the method from release to
release as they see fit.

~~~
JupiterMoon
But what about making them private APIs that also require authorisation.

------
a3n
> Since we also identify SDKs by their binary signatures, we noticed that
> these functions were all part of a common codebase, the Youmi advertising
> SDK from China.

> We believe the developers of these apps aren’t aware of this since the SDK
> is delivered in binary form, obfuscated, and user info is uploaded to
> Youmi’s server, not the app’s.

Know your binaries?

~~~
adevine
Which is impossible with any app of reasonable size that uses third party SDKs
(which are often required for business reasons). Apple didn't catch it, how
can we expect hundreds of small developers to catch it?

~~~
mikeash
Apple spends a few minutes reviewing any given app. The developers have much
more time to examine this stuff.

~~~
NateLawson
I disagree that developers can be expected to find this kind of thing on their
own, say by digging in with a debugger. It's simply not a scalable solution.

I think you need an App Store-wide view of the entire software world and the
ability to query it for arbitrary behavior. But I'm biased, since that's what
we've spent years building. :-)

~~~
a3n
In the case of the SDK in question, how much due diligence on the developers'
part would you say is appropriate? Not _necessarily_ binary analysis, but more
along the lines of knowing who you're getting your tools from, reputation,
whatever, etc.

Because as far as I, a user, am concerned, that developer put that software on
my phone.

Maybe I as a user have a similar due diligence burden. One thing I might do is
never download an app from the affected developers again. But that doesn't
seem like a desired outcome. Nevertheless, I don't see how a user could do
anything more fine-grained and nuanced than that.

~~~
NateLawson
Really great question. Certainly, there are "too good to be true" offers out
there you might want to be concerned about, such as ad networks that pay crazy
high CPMs.

I'm biased, but I think developers should be using our service to track the
code that they're putting in apps (including their own code) for security,
quality, and app review problems. We watch third-party code for them, which is
how we found this issue. I think it's unreasonable to expect developers to
reverse-engineer every library they include.

I agree users really can't do as much about this. We are considering ways to
distribute the list of vulnerable apps/versions to help users find if they
have them.

------
musesum
This is not new. Check out the "Microsoft AARD code" -- an inverted example of
surreptitious analytics, from 1992. TL;DR: the beta version of Windows 3.1
showed a warning if the user was running DR-DOS, a competing OS. The payload
was encrypted and could be triggered in the production version of Win 3.1 by
changing a flag.

~~~
yuhong
Yea, there was a reason why I mentioned DR-DOS when I was discussing the OS/2
2.0 fiasco.

------
peterclary
"The apps using Youmi’s SDK have been removed from the App Store and any new
apps submitted to the App Store using this SDK will be rejected." I'll be
interested to see what happens to Youmi now that they're blocked from iOS. SDK
developers: Consider yourselves warned.

~~~
knd775
I imagine that new versions with the private API stuff removed will be
allowed. I don't believe that this is some sort of lifetime ban on Youmi.

------
pradn
How did SourceDNA have access to millions of iOS app binaries? Can anyone just
download all the apps in the App Store?

~~~
asadlionpk
I think you can download the .ipa file using iTunes. Alternatively, you can
use iFunbox to back up apps.

~~~
jevinskie
AFAIK you still have to run those on a jailbroken device, letting the kernel
decrypt the __TEXT segment for you and then dumping the decrypted binary from
memory to disk. Though every binary is encrypted the same no matter the
device, so supposedly there may be some key you can extract to enable
decryption off-device.

------
viraptor
The blog post is about two groups discovering the bad apps and reporting them
to Apple, but then: "Apple has issued the following statement. “We’ve
identified a group of apps that..." Stay classy, Apple -- great attribution.

------
anonymousDan
Is it just me, or are they totally ripping off the research done by the iRiS
team and making it sound like they came up with these vulnerabilities
themselves? I know they give the researchers a cursory mention, but it's
buried at the bottom of the article.

~~~
tptacek
No, that's not at all what they did. They discovered the issue independently
and, as you can see if you read the whole article, in a manner very different
from that of the other team.

(Fair warning: I am not just a disinterested observer of SourceDNA).

