Reverse Engineering an Unknown Microcontroller (dmitry.gr)
326 points by dmitrygr 79 days ago | 52 comments



> The thing about humans is: they're human. Humans like nice round numbers. They like exactness, even when it is unnecessary. This helps a lot in reverse engineering. For example, what response does the constant 0x380000 elicit in you? None? Probably. What about 0x36EE80. Now that one just catches the eye. What the hell does that mean? So you convert that to decimal, and you see: 3,600,000. Well now, that's an hour's worth of milliseconds. That length is probably only useful for long-term low power sleep. I have lost track of how many things I've reverse engineered where constants of this variety lit the way to finding where sleep is being done!

Really great work! This was probably the simplest yet coolest insight from this write-up.
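
The constant-spotting trick in the quote can be sketched in a few lines. This is only an illustration, not the article's tooling: the `UNITS_MS` table and the `describe_constant` helper are made-up names, and the sketch assumes the firmware counts time in milliseconds.

```python
# Candidate time units, expressed in milliseconds. Assumption: the firmware
# counts in ms; other tick rates would need their own table.
UNITS_MS = {
    "second": 1_000,
    "minute": 60_000,
    "hour": 3_600_000,
    "day": 86_400_000,
}


def describe_constant(value):
    """Return readable interpretations of value as a whole number of time units."""
    hits = []
    for name, ms in UNITS_MS.items():
        if value >= ms and value % ms == 0:
            count = value // ms
            hits.append(f"{count} {name}{'s' if count != 1 else ''}")
    return hits


# The two constants from the quote: one arbitrary, one "human-shaped".
for constant in (0x380000, 0x36EE80):
    print(hex(constant), constant, describe_constant(constant))
```

Running every immediate value from a disassembly through a filter like this is a cheap way to shortlist constants that a human probably typed as a round number of seconds, minutes, or hours.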

Flipping this on its head, can anybody recommend a reading list on more robust strategies to obfuscate this sort of breadcrumb?


You can do a few things:

- Laser off the part marking. Not knowing what a part is makes the job much more difficult

- One time programmable chips: can't modify or read off firmware if the JTAG bus is disabled

- Encrypted firmware: helps if someone is able to fuzz the chip to dump the firmware

- BGA parts: hide the pins, bury the traces. It makes the job harder but not impossible

- Programming before soldering: you can leave the programming pins disconnected so someone would have to remove the chip before attempting anything on it

- Use more advanced features of the chip: some chips offer secure memory locations that can contain decryption keys, magic numbers, whatever you want. You could have a magic number that you XOR with every literal. It would certainly make it more difficult to determine what is what in the assembly code, even if you could decrypt it

- Pour some epoxy over the chip or board: makes repairs impossible but also can screw over the reverse engineer.

- Work with a manufacturer to build a custom chip. You could do crazy things like move the programming pins around and hide them as other things. Like the JTAG test points would be random decoupling caps hidden in the board.

- Finally, threaten to sue anyone that publishes anything
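
The "XOR every literal with a magic number" idea from the list above could look roughly like this. It is a minimal sketch under stated assumptions: `MAGIC` is a made-up key (on a real part it might be read from a secure memory location), values are treated as 32-bit, and `mask`/`unmask` are hypothetical helper names.

```python
# Hypothetical key. On a real chip this might live in secure memory
# instead of sitting in the firmware image.
MAGIC = 0x5A5A5A5A


def mask(literal):
    """Build-time step: store the literal XORed with the magic number."""
    return (literal ^ MAGIC) & 0xFFFFFFFF


def unmask(stored):
    """Run-time step: XOR again with the same key to recover the literal."""
    return (stored ^ MAGIC) & 0xFFFFFFFF


one_hour_ms = 3_600_000        # 0x36EE80, easy to spot in a dump
stored = mask(one_hour_ms)     # what would appear in the binary instead
print(hex(stored), stored)     # no longer a round decimal number
assert unmask(stored) == one_hour_ms
```

This only hides the breadcrumb from a casual scan of the literals; anyone who recovers the key, or notices the same XOR before every use, can undo it.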


How would you protect (non-SaaS) software against copying or reverse engineering?


You can't. It goes against the very nature of the medium, like trying to delete something on the internet. If it's something a CPU has to execute, it has to be in memory where it can be dumped. At best, all one can do is make it harder, which stops the less determined adversaries.

That said, there actually is one nasty [1] workaround: run some critical functionality on a custom USB dongle that the user has to have connected in order to use the software. It could be a calculation in a critical path that's not compute bound but without which the software is unusable. It could even be a JIT engine that consumes encrypted code and returns polymorphic executable code designed to be near impossible to assemble back into a static binary. Some fabs can make tamper-resistant ASICs with a specialized packaging process that couples the on chip memory to the package so that opening the package makes the memory unrecoverable for extra security. This level of protection would be effective against all but the most determined and well funded nation state or competitor.

[1] Nasty for the user, the developer, and the investor all in one!


Sounds like someone could build a company on that idea.


That's basically what the casino gaming industry is. They're some of the most physically secure systems against reverse engineering that you're likely to run across. More so than ATMs or a lot of other secure systems.


Something like that needs a threat model including your attackers' motivations and means.


A good real world example of how security by obscurity is actually more secure.

It seems that many have interpreted the phrase "security by obscurity" to mean that any obscurity is completely useless. But I'm sure whoever coined that phrase simply meant that if your only security is obscurity, then you are going to have a bad time.

The reality is, obscurity can be a great additional wall of defense. Something that the real world has known since forever (think hidden safes or unmarked money trucks that rotate their schedules on random intervals).


Use different units. 1 hour = 2.976 millifortnight
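
A quick check of that conversion, as a throwaway sketch (the helper name is made up; a fortnight is taken as 14 days of 24 hours):

```python
HOURS_PER_FORTNIGHT = 14 * 24  # 336 hours


def hours_to_millifortnights(hours):
    """Convert hours to thousandths of a fortnight."""
    return hours / HOURS_PER_FORTNIGHT * 1000


print(round(hours_to_millifortnights(1), 3))
```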


> can anybody recommend a reading list on more robust strategies to obfuscate this sort of breadcrumb?

Did you maybe mean to de-obfuscate?


Probably not, that's why they said "Flipping this on its head"


Sorry, English is not my native language so I didn't get the head flipping part.


"turn something on its head" normally means "to treat or present something in a completely new and different way[1]" e.g. when trying to fix a machine, flipping a part over to see if it fits better "the other way"

In this conversation, it was used to mean "flipping the script"[2], to change or reverse something dramatically

The grandparent comment wanted to ask "the opposite question", not "how to make it easier to reverse engineer something", but "how to make it harder to reverse engineer something".

[1] https://www.collinsdictionary.com/us/dictionary/english/turn...

[2] https://idioms.thefreedictionary.com/flip+the+script


Can confirm, the question was on obfuscation strategies when designing rather than Reverse Engineering.


Brought to you by the legend who managed to run Linux on an 8-bit microcontroller: http://dmitry.gr/?r=05.Projects&proj=07.%20Linux%20on%208bit


Knowledge of East Asian languages is very useful for such work. Unfortunately I don't know Korean at all, and can only recognise a bit of Chinese and Japanese, but I suspect you might be able to find more information on this MCU if you searched in Korean. I'm not sure how Korean culture compares, but there are a few Chinese electronics forums where people will share otherwise unobtainable datasheets.


I'm not sure to what extent Dmitry went through the literature and translated, but my wife did the translation of the MCU block diagram for him, which probably helped confirm some ideas of what certain elements were once he saw it.


Google translate handled all but that pic, Theo! Thnx!


I'm only aware of pudn. What are the others?


51rd, 51cto, 51hei, and a few others whose names begin with 51 (8051 heritage?), plus elecfans, often come up in my search results.


51 sounds similar to "I want" in Chinese.

https://newrepublic.com/article/117608/chinese-number-websit...


Great article, I always like to read articles about hardware, which I know very little about.

Just a small suggestion, for the image "segmentTagBig", I think it's better to orientate the side-by-side image the same way (i.e. flipping by longer side/central line, instead of shorter side). It took me a while to understand the layout.

A quick photoshop: https://i.imgur.com/ForiIkY.jpg (you can go further and mirror one of the sides so every component's location is exactly the same, but that may cause confusion.)


That DOES look better. I'll do that. Thanks


We’re really lucky that there are a few talented hackers kicking around who can also share their insights clearly like this.

That was an epic read. I loved that. So much knowledge, so much experience brought to bear on the problem.


This guy is a legend in Palm OS aficionado circles. Back in the day he was pushing Palm OS to the limit and fixing issues that Palm was overlooking. His software was really hard to crack too. Truly great to see some of his work on other stuff. Dmitry, you rock!


As someone who has been messing around with 8051 controllers because I found a tube of 1989 vintage Signetics P80C550s in a box of stuff I hadn't looked at in 15 years, I'd love to hear about these show stopping bugs in SDCC that were referenced but not explained.


Most annoying one:

Large model. SDCC v 4.1 or 4.0.4.

Place two sequential (do not declare anything between them) uint16 vars in __idata

Use both in some math statements

Eventually (after 4-5 statements, once the compiler has to start spilling intermediates to RAM), on a read access, SDCC will access the first one when you intend to access the second. (It emits two unneeded "dec R0" instructions for no reason.)

Declaring the vars as "volatile" helps sometimes, but changing code shape sometimes brings the bugs back. Not placing two u16s in a row into idata seems to avoid it


Ah, I'd been using the small model up to this point, so that's probably why I haven't bumped into this.


Just amazing, these types of reads make me wanna move to hardware and stop doing silly web projects.


Funny. I'm (OP) considering moving to web projects cause those are more conducive to permanent WFH. Hardware requires physical presence too often.


Yeah, but the ultimate WFH experience I've ever had was consulting for someone who needed a UI to control a large motion system. I lived about 300 miles away, so we set up a remote PC connected to the system, along with a camera and Teamviewer.

That way I could code from the comfort of my home and watch the axes (this was for a CAT scanner big enough for a horse) move and make sure I didn't ram anything against the physical stops. Plus it was helpful to make sure no one had their hands in the mechanism before I started telling large servomotors to move around.

It worked really well as long as they remembered to leave the lights on before they went home for the day.

The fun I had almost made up for my ridiculously low bid!


Can confirm: hw trying to move to software. Sw is easier and more convenient to tinker with and debug, has no restrictions on lab and hw access, and sadly also pays better :(


@ both parents - sounds like a challenge to create a robot to be remotely controlled from home to allow you guys to move electronics in the real lab in order to do your job.

Eager to read about it here on HN top in a few months Dmitry!


That's great, until you have to debug the robot...



Thanks! I really enjoyed that read. This guy is amazingly talented!

It's been a long time since I've done this kind of thing. It's almost nostalgic, to read this.


This seems like a pretty generic little microcontroller.

Why didn't the makers of this price label just use a little 10 cent off the shelf microcontroller? I doubt price tags are made in sufficient volume to ever get the engineering costs of a custom microcontroller low enough to go below 10 cents...


> Why didn't the makers of this price label just use a little 10 cent off the shelf microcontroller? I doubt price tags are made in sufficient volume to ever get the engineering costs of a custom microcontroller low enough to go below 10 cents...

Are you sure about that? As a hypothetical example, imagine that Walmart is switching all of their stores over to eInk price tags. I found some numbers from 2005 for number of stores and SKUs per store at https://corporate.walmart.com/newsroom/2005/01/06/our-retail...: multiplying out the number of stores with the average number of items carried gives you 500 million price tags.
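
To put a volume like that in perspective, here is a rough sketch of how a one-time engineering cost amortizes per unit. Only the 500 million figure comes from the estimate above; the NRE and per-chip manufacturing costs are purely hypothetical placeholders, and `per_unit_cost` is a made-up helper.

```python
def per_unit_cost(nre_usd, unit_cost_usd, volume):
    """Manufacturing cost per chip plus the one-time engineering cost spread over the run."""
    return unit_cost_usd + nre_usd / volume


NRE_USD = 5_000_000   # hypothetical one-time cost of a custom design
UNIT_COST_USD = 0.05  # hypothetical manufacturing cost per chip

for volume in (1_000_000, 50_000_000, 500_000_000):
    cost = per_unit_cost(NRE_USD, UNIT_COST_USD, volume)
    print(f"{volume:>11,} units -> ${cost:.3f} per chip")
```

With these made-up inputs the engineering share falls from dollars per chip at a million units to about a cent at 500 million, which is the shape of the argument, whatever the real figures are.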


It's Samsung. They love vertical integration and they have their own fabs. They probably rolled out their own internal microcontroller across all their tiny gadgets, so as not to be dependent on external suppliers.


They do use this same one in some industrial lighting controllers for buildings


$0.10 for a gadget like that is huge.

I work on the main chip of a gadget that costs thousands of dollars with high margins on the end product, and our customers don’t like it one bit when we require an additional external component that increases the BOM cost by $0.50.

There’s a person somewhere in the supply chain whose job it is to question the cost and necessity of everything. If you don’t have that, engineers get frivolous quickly.

(It reminds me of the case, many years ago, where they asked me if we could swap a crypto chip with dedicated keys for a cheaper generic version that cost 1 cent less.)


10c is still too expensive for this type of mass production gadget. I would expect to see it priced at below 1c/unit.

10c is about the BOM cost for each of these price tags.


I work on these kinds of tags and they're actually quite expensive, 10c is about the cost for the battery alone.


> 10c is still too expensive for this type of mass production gadget. I would expect to see it priced at below 1c/unit.

Is there any evidence for this? what volume do you mean by "mass production gadget"?


I highly doubt any real data is published; all of it is commercially sensitive and under NDA. Mass production in this context is at least 1 million units.


Could you point to any data, circumstantial or otherwise that indicates a 1 cent device of this type? Perhaps you could elaborate on why you believe such a price point would be achievable or even offered. Otherwise, I'm sure you can understand how the rest of us might have difficulty evaluating the veracity of your claim.


EEVblog has a video of what could be possible. He is talking about a 3c part. With direct negotiation at massive volumes, a sub-1c part is definitely possible.

https://www.youtube.com/watch?v=VYhAGnsnO7w

What has your experience been with the very cheap parts?


There are MCUs that cost less than 1c


Sub-cent is the realm of 4-bit mask ROM bare-die-only parts with a few dozen bytes of RAM, not an 8051 with 64k flash and integrated 2.4GHz radio.


Micros with radios aren't that cheap. They are all rather unique too.


Perhaps as a kind of dongle, to prevent their product from being counterfeited?


Thanks for the share!



