Android App Reverse Engineering 101 (maddiestone.github.io)
380 points by conductor on May 2, 2019 | 32 comments

The author has yet to complete the tutorial, but it's an interesting resource nonetheless.

The tutorial does leave out https://frida.re, which offers a runtime, no-root reverse engineering mechanism. I'm currently using it to MiTM apps with cert-pinned TLS.

There's also the excellent FlowDroid and Androguard, the latter of which I've used for static analysis [0].

I recall NateLawson founded a YC startup, SourceDNA [1], that offered intelligence on reverse engineered iOS and Android apps (based on static analysis). I wonder what tools they used.

[0] https://github.com/ashishb/android-security-awesome

[1] https://news.ycombinator.com/item?id=10049925

Wow! How have I not heard of frida? This should make my normal process much simpler!

1) Decompile App -> smali

2) Decompile App -> Java (non-reversible, but easier to read)

3) Search the app for certificate pinning code (check for network_security_config.xml or grep for OkHttp pinning functions)

4) Locate the code I just found in the Java output in the smali version

5) Remove the pinning code

6) Recompile smali -> apk

7) Fix whatever was causing the smali not to recompile

8) Recompile again

9) Pray

10) Install on device

11) Run app (that hopefully doesn't crash)

12) Pipe connection through Charles proxy

13) Read api calls!

I'll definitely give it a go.
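
Step 3 of the list above (searching the decompiled output for pinning code) can be roughly automated. This is a minimal sketch, assuming the app has already been decoded into a directory of text files (smali, Java, decoded XML). The marker list is a heuristic chosen for illustration, and `find_pinning_hints` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import re
from pathlib import Path

# Heuristic markers only (an assumption for illustration, not an exhaustive list):
# - okhttp3.CertificatePinner is OkHttp's pinning class
# - android:networkSecurityConfig / <pin-set> / <pin> belong to Android's
#   Network Security Configuration
# - X509TrustManager often shows up in hand-rolled certificate checks
PINNING_PATTERNS = {
    "OkHttp CertificatePinner": re.compile(r"CertificatePinner"),
    "network security config": re.compile(r"networkSecurityConfig|<pin-set|<pin\s"),
    "custom TrustManager": re.compile(r"X509TrustManager"),
}

# Only scan file types that decompilers typically emit as text.
TEXT_SUFFIXES = {".smali", ".java", ".xml"}


def find_pinning_hints(root):
    """Return a sorted list of (relative_path, line_number, label) hits."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        with path.open("r", encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                for label, pattern in PINNING_PATTERNS.items():
                    if pattern.search(line):
                        rel = path.relative_to(root).as_posix()
                        hits.append((rel, lineno, label))
    return sorted(hits)
```

Calling `find_pinning_hints("decoded_app")` on a decoded app directory (the directory name is a placeholder) returns candidate locations to inspect by hand. A hit is only a hint, and an empty result does not prove the app is free of pinning, since obfuscated or native-code checks will not match these strings.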

In general I think there are nowhere near enough resources on decompilation, particularly on a purportedly "open" platform like Android. Really looking forward to the rest of the tutorial coming online.

As an alternative, you could use the Xposed framework. It basically lets you override app and system methods and classes by installing a small module. No need to modify your target APK. So you can just return true from the method that checks if the correct certificate is in use.

Xposed is amazing, but you can override methods with Frida as well. The difference is that Frida works with or without rooting the device.

"there are nowhere near enough resources on decompilation, particularly on a purportedly "open" platform like Android."

Most apps are definitely not 'open', and unfortunately most 'reverse engineering' is done with nefarious intentions.

Once one has had key code stolen from them, it changes one's perspective a little.

Unless your APK is internal and not distributed outside a small circle, assume your code has been reverse engineered.

Surely if you're running something that has to be that "closed", most of the key code is server side and the client is just calling APIs.

Reverse engineers are not your enemy.

"Unless your APK is internal and not distributed outside a small circle, assume your code has been reverse engineered."

Of course, but that doesn't obviate the illegality of stealing protected code.

"Reverse engineers are not your enemy."

Yes, many of them are.

Many of these resources and individuals are involved in illegal and immoral acts around stealing code and IP, and justify it to themselves with some kind of skewed logic.

Nobody really cares that folks are hacking code for fun, and nobody cares that people would use resources for this purpose. This is fine and possibly helpful.

But IP theft is a big deal, and it's very damaging.

"Surely if you're running something that has to be that "closed", most of the key code is server side and the client is just calling APIs"

Unfortunately, this isn't possible in a variety of cases, especially new and upcoming scenarios that require AI to be 'on device'. There are many other such scenarios.

It's very oppressive to a major class of innovation - particularly to those who've worked very hard to assemble something exceptional and useful.

Are you suggesting that reverse engineering resources not be made available because you feel it will lead to people stealing code?

These are great resources, thanks so much.

I know some friends who "steal" popular apps from the Google Play Store, then recompile and republish them under their own namespaces.

Even worse, one of them then reported the original author for stealing his app.

What he did was take the app's resources and put his own ads into the stolen app.

I'm not sure if Google can track those things.

That's why I always assume most Vietnamese apps on the Play Store are "stolen" in some way.

In Vietnam, "stealing apps" is a real dark business.

Google could stop distributing pirated apps after complaints. It should be pretty clear who published the app first.

The problem is that the author doesn't always know his apps have been stolen.

Sounds like they could do with an automated plagiarism detector, although that has its own gameability problems.

Doing this in a fully automated way that works well is tough. Blindly applying DroidMOSS falls over against simple techniques.

Add to it that lots of repackaged apps are distributed outside channels that Google controls and you've got a hard problem.

Ghidra has a pretty good structure editor and structure definition system. You can import a C header (like a modified jni.h) and then you can declare parameters as being of type “JNIEnv *” - after that, Ghidra will automatically resolve function pointer calls for you. No need to keep consulting an offset table.

(IDA’s decompiler has all of this too, but it costs a lot more!)

What about graph view? IDA will let you use struct offsets in the assembly language, making the assembly much more readable. I'm hearing that Ghidra doesn't, and that you are forced to use the decompiler to make use of the structs.

I'm not completely sure what you're talking about, but if you annotate variables with types Ghidra will show member accesses instead of offsets in the assembly listing.

No, like this: https://www.hex-rays.com/products/ida/tech/graphing.shtml

It is sort of like a flow chart, with the assembly shown for each chunk. I've loaded up functions with over 5000 blocks of code, including one function that was a third of a megabyte in size. Navigation becomes important.

Ghidra is supposedly slow at this scale.

I'm also told that Ghidra seems to not do struct offsets in that view, forcing the use of the decompiler. With IDA the struct offsets can be chosen and viewed, all without involving the decompiler.

Ghidra has a function graph view that shows the control flow for a function.

MobSF[1] is a good tool for anyone in need of reverse engineering an apk for security audit purposes.

[1] https://github.com/MobSF/Mobile-Security-Framework-MobSF

Somewhat related: has anyone found a difference in the quality of decompiled Dalvik bytecode and JVM bytecode, with the former being register-based?

I used to reverse engineer Android apps between 2010 and 2012. I used a couple of methods.

1. Dare + JD Decompiler + Cavaj (or) DJ Decompiler

2. dex2jar + JD Decompiler + Cavaj (or) DJ Decompiler

3. AndroChef Java Decompiler

And for selective decompilation, smali (or) baksmali with deodexing for system applications.

They were all plagued by the various decompilation and retargeting issues of that time. I would love to see how things have changed now.

That raises one question: where do you get the APK from?

Besides the other suggestions, there's Evozi's APK downloader, which generates APKs from pasted Google Play URLs [0], and you can also download APKs through the proxied Play Store replacements 'Yalp' and 'Aurora', which are available on F-Droid.

[0] https://apps.evozi.com/apk-downloader/

You can either install it on a real hardware device and, once it is connected via USB, pull it with adb (adb shell pm path <package> prints the on-device path, which you then pass to adb pull), or you can download it from these sites:

APKCombo [1] and APKPure [2]. I've tried both for different RE purposes, and they also have the latest updates for a lot of apps.

[1] : https://apkcombo.com/

[2] : https://apkpure.com

From your device using adb pull?
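
For anyone who wants to script the adb route, here is a minimal sketch, assuming `adb` is on the PATH and exactly one device is connected. `parse_pm_path` and `pull_apks` are hypothetical helper names. The only adb commands used are `adb shell pm path <package>` and `adb pull <remote> <local>`.

```python
import subprocess


def parse_pm_path(output):
    """Extract on-device APK paths from the output of `pm path <package>`.

    Each relevant line looks like `package:/data/app/.../base.apk`. Apps
    installed as split APKs produce several such lines.
    """
    paths = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("package:"):
            paths.append(line[len("package:"):])
    return paths


def pull_apks(package, dest_dir="."):
    """Pull every APK belonging to `package` into `dest_dir`.

    Assumes adb is installed and a single device is attached.
    """
    result = subprocess.run(
        ["adb", "shell", "pm", "path", package],
        capture_output=True, text=True, check=True,
    )
    remote_paths = parse_pm_path(result.stdout)
    for remote in remote_paths:
        subprocess.run(["adb", "pull", remote, dest_dir], check=True)
    return remote_paths
```

Usage would be something like `pull_apks("com.example.app", "apks/")`, where the package name and directory are placeholders. Keeping the parsing separate from the subprocess calls makes it easy to check the parser without a device attached.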

Is there a more reliable way to get the APK programmatically, without having to use an Android device as a middleman?

I know of gplaycli (https://github.com/matlink/gplaycli/) but its reliability leaves a lot to be desired afaik.

There is also the Yalp store (an open source Google Play front-end), as well as a bunch of third-party websites that will happily give you an untrusted APK (they all claim to be secure and original, but somehow the SHA-2 hashes of supposedly identical versions have always differed for me, n=3).

Could it be a case of multiple APKs, perhaps for different CPU architectures? In any case I would check the value of versionCode first (https://developer.android.com/guide/topics/manifest/manifest...).
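
One way to investigate mismatching hashes is to look inside the archives instead of only hashing the whole file. An APK is a zip file, and native libraries live under `lib/<abi>/`. This is a minimal sketch using only the standard library, with hypothetical helper names, that hashes the file, lists the bundled ABIs, and reports which zip entries differ between two downloads.

```python
import hashlib
import zipfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Hex SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def native_abis(apk_path):
    """Sorted ABIs for which the APK bundles native libraries (lib/<abi>/*.so)."""
    abis = set()
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            parts = name.split("/")
            if len(parts) >= 3 and parts[0] == "lib" and name.endswith(".so"):
                abis.add(parts[1])
    return sorted(abis)


def differing_entries(apk_a, apk_b):
    """Sorted entry names that exist in only one APK or whose CRC-32 differs."""
    with zipfile.ZipFile(apk_a) as a, zipfile.ZipFile(apk_b) as b:
        crc_a = {info.filename: info.CRC for info in a.infolist()}
        crc_b = {info.filename: info.CRC for info in b.infolist()}
    names = crc_a.keys() | crc_b.keys()
    return sorted(n for n in names if crc_a.get(n) != crc_b.get(n))
```

If `differing_entries` only reports files under `lib/`, the two downloads are probably per-architecture variants. If it reports signature files under `META-INF/` or `classes.dex`, the APK has likely been re-signed or modified. CRC-32 is used here only as a cheap difference indicator, not as a security check.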
