biosboiii's comments | Hacker News

As a reverse-engineer tinkering with iOS this reminded me of some system apps.

E.g. in the App Store you click a button, the app sends a request, and the response contains an XML-like structure describing the UI mutation for your action.

<Alert>

   <Header>iTunes Login</Header>

   <Body>We could not find a user with those credentials.</Body>
</Alert>

type stuff.
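For illustration, a client-side handler for such a response might look roughly like this (a Python sketch using the made-up tag names from the example above; the real App Store protocol is of course more involved):

```python
import xml.etree.ElementTree as ET

# Hypothetical server response describing a UI mutation
response = """
<Alert>
    <Header>iTunes Login</Header>
    <Body>We could not find a user with those credentials.</Body>
</Alert>
"""

root = ET.fromstring(response)
if root.tag == "Alert":
    # The client never decides what the dialog says; the server does.
    header = root.findtext("Header")
    body = root.findtext("Body")
    print(f"{header}: {body}")
```

The point of the pattern: the client only knows how to render generic widgets, so the server can change flows and copy without an app update.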


Server-Driven UI is a very common architectural pattern.


My profile is largely unused, I follow no one, and about 1 in 3 times I open the front page I get straight-up holocaust denial threads suggested. Completely insane.


Yeah, you can basically just unzip IPA files. Obtaining them is hard though; I have a pathway if you are interested.

But the Objective-C code is actually compiled to native machine code, and decompilation is a lot harder than with the JVM languages on Android.

My next article will be about CoreML on iOS, doing the same exact thing :)


> My next article will be about CoreML on iOS, doing the same exact thing :)

Can't wait - thanks for writing it up!


Thanks a lot :)

My general writing style is aimed mainly at my non-technical colleagues, whom I hope to inspire to learn about computers.

This is by no means novel; it is a pretty standard use case of Frida. But I think many people, even software developers, don't grasp the concept of "what runs on your device is yours, you just don't have it yet".

Especially in mobile apps, many devs get sloppy with their APIs because you can't just open the developer tools.


I'm a mobile developer and I'm new to using Frida and other such tools. Do you have any tips or reading material on how to use things like Frida?


I think you are starting off from the perfect direction, being a forward-engineer first, and then a reverse-engineer.

The community around Frida is a) a bit small and b) a bit unorganized/shadowy. You cannot find that many resources; at least I have not found them.

I would suggest using Objection: explore an app, enumerate its classes with "android hooking list classes" or "android hooking search classes", then dynamically watch and unwatch them. That is the quickest way to start; once you start developing your own scripts you can always check out code at https://codeshare.frida.re/.

For everything else join the Frida Telegram chat, most knowledge sits there. I am also there, feel free to reach out to @altayakkus.

Oh and btw, I would start with Android, even though iOS is fun too, and I would really, really suggest getting a rooted phone/emulator. For the Android Studio emulator you can use rootAVD (on GitHub) and then just install Frida via Magisk. Installing the Frida gadget into APKs is a mess which you won't miss once you go root.


Author here, no clue about homomorphic (or whatever) encryption. What could certainly be done is some sort of encryption of the model tied to the inference engine.

So e.g.: Apple CoreML issues a public key, the model is encrypted with that public key, and somewhere in a trusted computing environment the model is decrypted using the private key and then used for inference.

They should of course use multiple keypairs etc., but in the end this is just another obstacle in your way. When you own the device, root it, or even gain JTAG access to it, you can access and control everything.

And matrix multiplication is a computationally expensive process; I guess they won't add some sort of encryption technique to each and every cycle.
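The flow described above can be sketched end to end. This is a toy: it uses a hash-derived keystream as a stand-in for real asymmetric crypto (an actual design would use something like RSA or ECIES with the private key held in a secure element), and every name and value is hypothetical:

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    """Toy keystream derived from a key; a stand-in for real crypto."""
    for i in count():
        yield from hashlib.sha256(key + i.to_bytes(8, "big")).digest()

def xor_crypt(data: bytes, key: bytes) -> bytes:
    # XOR with the keystream; applying it twice restores the input.
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

# Vendor side: encrypt the model for a device-specific key.
device_key = b"burned-into-silicon"        # hypothetical per-device secret
model_plain = b"\x00model weights...\x00"  # stand-in for a .tflite blob
model_encrypted = xor_crypt(model_plain, device_key)

# Trusted environment on the device: decrypt just before inference.
recovered = xor_crypt(model_encrypted, device_key)
assert recovered == model_plain
# ...but once decrypted for the multiply units, a sufficiently
# privileged attacker can still dump it, which is the point above.
```

The obstacle is real but not absolute: the plaintext model must exist somewhere reachable by whoever controls the device.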


In principle, device manufacturers could make hardware DRM work for ML models.

You usually run inference on GPUs anyway, and those usually have some kind of hardware DRM support for video already.

The way hardware DRM works is that you pass some encrypted content to the GPU and get a blob containing the content key from somewhere, encrypted in a way that only this GPU can decrypt. This way, even if the OS is fully compromised, it never sees the decrypted content.


But then you could compromise the GPU, probably :)

Look at the bootloader, can you open a console?

If not, can you desolder the flash and read the key?

If not, can you access the bootloader when the flash is not detected anymore?

...

Can you solder off the capacitors and glitch the power line, to do a [Voltage Fault Injection](https://www.synacktiv.com/en/publications/how-to-voltage-fau...)?

Can you solder a shunt resistor to the power line, observe the fluctuations and do [Power analysis](https://en.wikipedia.org/wiki/Power_analysis)?

There are a lot of doors, and every time someone closes one, a window remains tilted.
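The power-analysis idea is easy to demonstrate in simulation: if a chip's consumption correlates with the Hamming weight of secret-dependent intermediate values, correlating key guesses against measured traces recovers the secret. A toy sketch, with a made-up leakage model and parameters:

```python
import random

random.seed(1)
SECRET = 0xA7  # hypothetical secret key byte inside the chip

def hamming(x: int) -> int:
    # Number of set bits; a common power-leakage model.
    return bin(x).count("1")

def power_trace(plaintext: int) -> float:
    # The chip "leaks": consumption correlates with the Hamming
    # weight of the secret mixed with the input, plus noise.
    return hamming(plaintext ^ SECRET) + random.gauss(0, 0.5)

# Attacker: observe many (input, consumption) pairs via a shunt resistor.
samples = [(p, power_trace(p))
           for p in (random.randrange(256) for _ in range(2000))]

def score(guess: int) -> float:
    # Pearson correlation between predicted leakage and measurement.
    pred = [hamming(p ^ guess) for p, _ in samples]
    meas = [t for _, t in samples]
    mp, mm = sum(pred) / len(pred), sum(meas) / len(meas)
    cov = sum((a - mp) * (b - mm) for a, b in zip(pred, meas))
    var = (sum((a - mp) ** 2 for a in pred)
           * sum((b - mm) ** 2 for b in meas)) ** 0.5
    return cov / var

best = max(range(256), key=score)  # recovers SECRET
```

Against real hardware you would capture traces with an oscilloscope across the shunt resistor; the statistics are the same.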


Any company serious about building silicon that holds keys wouldn't just store them in flash.

Try getting a private key off a TPM. There have been novel attacks, but they are few and far between.

Try getting a key from Apple's trusted enclave (or whatever buzz-word they call it).


You're right about the TPM, I won't get the key out of it. It's a special ASIC which doesn't even have the silicon gates to give me the key.

But is the TPM doing matrix multiplication at 1.3 petaflops?

Or are you just sending the encrypted file to the TPM and getting the decrypted file back from it, which I can intercept, be it on SPI or by gaining higher privileges on the core itself? Just like with this app, only lower down?

Whatever core executes the multiplications will be vulnerable one way or another to a motivated attacker with the proper resources. This is true for every hardware device, but the attack vector of someone jailbreaking a Nintendo Switch using an electron microscope and an ion-beam mill is negligible.

If AI models are worth being that paranoid about, some attacker will have enough motivation to power through.

Stealing the private key out of a GPU, which would let you steal a lot of valuable AI models, is break-once-break-everywhere.

Apple's trusted enclave is also just a TPM with different branding, or maybe an HSM, dunno.


I'll concede you are correct that whether the key is extractable doesn't really matter if the GPU will eventually need to store the decrypted model in memory.

However, if NVIDIA or similar were serious about securing these models, I'd be pretty sure they could integrate the crypto into the hardware multipliers etc. such that the model never needs to be decrypted anywhere in memory.

But at this point there isn't much value in deploying models to the edge, particularly the type of models they would really want to protect, as those are too large.

The types of models deployed to edge devices (like the Apple ones) are generally quite small and frankly not too difficult (computationally) to reimplement.


Author here, it would be nice to claim that I did this on purpose, but I really did not know it was open source.

I was rather interested in the process of instrumenting TF to make this "attack" scalable to other apps.


I think the comment author means offering inference via Firebase, with the model never leaving the backend.

This works, just like ChatGPT works, but it has downsides:

1. You have to pay for the compute of every inference.

2. Your users can't use it offline.

3. Your users will use a lot of data from their mobile network operator.

4. Your inference will be slower.

And since SeeingAI runs inference every second, your and your customers' bills would be huge.
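A quick back-of-envelope illustrates the point. Every number below is an assumption for illustration only, not a real price or user count:

```python
# Back-of-envelope cost of cloud inference at one frame per second.
# All numbers are hypothetical assumptions, not real figures.
price_per_1k_calls = 0.05       # USD per 1,000 inference calls (assumed)
seconds_per_session = 10 * 60   # a 10-minute assisted-vision session
sessions_per_user_per_day = 2   # assumed usage
users = 100_000                 # assumed user base

calls_per_day = users * sessions_per_user_per_day * seconds_per_session
daily_cost = calls_per_day / 1000 * price_per_1k_calls
# 100k users * 2 sessions * 600 calls = 120M calls/day -> ~$6,000/day
```

On-device inference moves that entire bill (and the latency, and the offline problem) to hardware the user already owns.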


That's what I thought, but the link doesn't say anything about off-device inference, it's only about storing and retrieving the model. There's just one off-hand note about cloud inference.

In any case, yeah, you could avoid downloading the model to the device at all, but then you have to deal with the other angle: making sure the endpoint isn't abused.

Maybe a hybrid approach would work: infer just part of the model (some layers?) in the cloud, then carry on the inference on the device? I'm not familiar with exactly how AI models look and work, but I feel like hiding even a tiny portion of the model would make it unusable in practice.
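The hybrid idea can be sketched with a toy two-layer network: the server keeps the first layer's weights and returns only its activations, so the shipped app never contains the full model. All weights and inputs here are made up:

```python
import math

# Toy split inference. The server-side layer is never shipped;
# the device only ever sees its output activations.
W1 = [[0.2, -0.5], [0.7, 0.1]]   # server-side layer (kept secret)
W2 = [0.6, -0.3]                 # on-device layer (shipped in the app)

def relu(x: float) -> float:
    return max(0.0, x)

def server_forward(x):
    # Runs in the cloud; one round-trip per inference.
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]

def device_forward(hidden):
    # Finishes the forward pass locally.
    logit = sum(w * h for w, h in zip(W2, hidden))
    return 1 / (1 + math.exp(-logit))  # sigmoid

activations = server_forward([1.0, 2.0])
prob = device_forward(activations)
```

The trade-off the thread raises still applies: each inference now costs a network round-trip carrying the activations, which for wide hidden layers can approach the size of the input itself.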


Your second note is very interesting; having looked at the model myself, this is very plausible.

For models which use a lot of input nodes and a lot of "hidden layers" and in the end just perform a softmax, this may be infeasible because of the amount of data you would have to transfer.

You may have inspired a second article :)


Hey, author here.

I understand that it's not very clear whether the neural net and its weights & biases are considered IP. I personally think that if some OpenAI employee just leaked GPT-4o, it wouldn't magically become public domain for everyone to use, and I think lawyers would start suing AWS if it just re-hosted ChatGPT. Not that I endorse it, but especially in IP, and in law in general, "judge law" ("Richterrecht" in German) is prevalent; laws are not a DSL with a few ifs and whiles.

But it is also a "cover my ass" notice, as others said; I live in Germany and our law regarding "hacking" is quite ancient.


Check out my reverse engineering/cracking of Microsoft's app that does just that, SeeingAI.

https://altayakkus.substack.com/p/you-wouldnt-download-an-ai


This was a great read. At this point, any org should assume on-device models are public.


Thanx! Yeah, they should :) Would love to do this with CoreML on Apple devices, but my newest iPhone is a 7.

But if you subscribe, you may see me doing the same with a surveillance camera soon(ish) :)


Qualcomm being Qualcomm, again and again and again.

