this is awesome…
I'll be the first to comment on the apparently misplaced bounds check(!) in the fragment of code above it; it reads a parameter from the stack, compares it with 8A, then makes it an index into some array of 8-byte elements and reads the two values from memory before deciding whether it's valid or not --- and seems to put -1 into eax if it's not.
Not really a problem if this is running in real mode (or "unreal mode") with no memory protection (it will just read 8 bytes from somewhere in the address space, and probably ignore them), but it could crash if it was in protected mode (which the lgdt in the preceding fragment suggests) set up with restrictive segment limits, and the memory address was not valid.
Then again, the check could be completely superfluous if that function would never be called with an out-of-bounds value...
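For anyone following along, here's a minimal sketch of the ordering issue in Python (the table size and contents are invented; it just illustrates the dereference-before-check pattern described above):

    # Hypothetical table standing in for the 8-byte-entry array in the dump.
    TABLE = [(i, 2 * i) for i in range(0x8A)]

    def lookup_as_disassembled(idx):
        # Mirrors the fragment above: the entry is read *before* the bound is
        # tested, so an out-of-range idx has already caused a bad read.
        lo, hi = TABLE[idx]            # IndexError here, like a wild read on hardware
        if not (0 <= idx < 0x8A):
            return -1                  # the "-1 into eax" path
        return lo + hi

    def lookup_checked_first(idx):
        # Conventional ordering: validate the index, then dereference.
        if not (0 <= idx < 0x8A):
            return -1
        lo, hi = TABLE[idx]
        return lo + hi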
The motivation for the change is that Intel's making the baseband instead of Qualcomm now, so it's not an ARM/Hexagon like it once was.
The author is just stuck on a decade-plus out-of-date view of chip architecture.
Additionally, given that this is a very low-power, low-level embedded device for a specific function, it seems most likely that it's also, like Intel's old Atom chips, a refined derivative of older, simpler x86 chips with in-order execution, no speculative execution, no µop transforms or other stuff like that. Would be kind of interesting to learn some of the details, but at any rate that means whole classes of security issues people are already bringing up in this thread (like Spectre/Meltdown) are entirely irrelevant. If some arch simply lacks speculative hardware, period, then that's that. Simplicity in general can have major performance impacts, sure, but it also means lower energy use and better security.
But no, you're not safe because crappy USB drivers can probably be exploited if they're expecting the USB device to be playing nice.
A USB-C/USB3 physical port makes things a bit more complicated, as it also could be attached to a controller supporting Thunderbolt, which is PCIe.
The unsafe part of plugging in a USB device is probably at the OS level rather than the protocol level - in my opinion you're much more likely to be owned by something like connecting a device with a buggy or compromised driver, mounting a filesystem containing an FS exploit, or reading a file containing an application exploit than by a USB-level protocol exploit.
Finding a USB protocol level exploit would be like finding an exploitable bug in how a NIC does scatter/gather of packets. Like, nothing's impossible, but damn.
You still shouldn't stick random unknown USB devices into your computer though.
PCI-E does DMA, and AFAIU was inspired by Infiniband (another RDMA protocol).
Guess Intel got the CDMA assets via its purchase of VIA Telecom assets in 2015. Wonder why it’s taken this long to come to market?
Thanks a ton!
Also I'm a bit baffled that they wrote a tool to measure the entropy of the machine code, tried hand-disassembling and considered that it might have been an encrypted binary format before guessing that an Intel chip could be running an Intel CPU. But they did try "EVERY POSSIBLE RISC ARCHITECTURE [they] KNOW" because apparently nobody ever used CISC on embedded devices. Nobody tell him about the GameBoy.
Of course I'm being a bit harsh; it's easy to mock in hindsight, but it's still not very interesting technically.
moreover, all previous iterations of Intel basebands were custom ARM cores based around Infineon IP acquired by Intel to be competitive in the baseband market...you did not even read my document, because I said this about the old baseband version
moreover, by the nature of baseband itself, it requires a CPU capable of real-time or near real-time processing; as a matter of fact, other vendors are using Cortex-R CPUs, which are ARM CPUs made for real-time OSes, giving you predictable timings, especially for interrupt processing and memory access
for example, Cortex-R gives you a special kind of memory called TCM (Tightly-Coupled Memory), which provides predictable memory access timings, something that you cannot obtain with a simple cache
by the way, Cortex-R is also used in WiFi chipsets, because the type of processing required is very similar (check the excellent writeup done by Google's Project Zero about this)
so yes, it is interesting to see how Intel managed to implement these kinds of features in an x86 CPU, which was never designed for such requirements
I suggest you take a look at the References in my document, they might provide some useful information on the matter
of course if you're not interested in baseband reversing, then I guess you're right, it's not technically interesting material
they were Hexagon only for just a few models (iPhone 5 and 5s I think), before that, they were using the Infineon baseband, which guess what...it's what Intel bought :)
btw, for the last 10+ years basebands were mostly ARMs, with very few exceptions (the already mentioned Hexagon); check also Mediatek and Huawei basebands
and yes, I don't like having an x86 as an embedded CPU, but that's my problem, I guess...
But, that being said, these things probably have a "Mobile Core" processor (i.e. an Atom), which is still quite low powered, and they probably don't run it at all that high of a clock rate either, saving more power.
Simpler instruction set means fewer gates means less power draw, with some hand waving. The ARM 1 had 25,000 transistors and the ARM 2 had 30,000. The contemporaneous Intel 386 had 280,000 and the 486 had 1,200,000. Intel had the engineering resources to design powerful chips that people bought in droves; ARM had a very small number of engineers and had to design something smaller, but as a consequence they ended up with very low power consumption, which wasn’t important until people put it in phones. Since then Intel has optimized for power consumption, but they were catching up for a long time.
In most modern CPUs, other factors dominate and instruction set is less important. Nowadays we have plenty of gates to spare and the question is how much can you accomplish at a given price and power envelope. And it’s irrelevant to talk about power consumption of a baseband processor unless you’re also talking about the power consumption of the radio, since they get used together.
From the conclusion:
"The result demonstrates that the decoders consume between 3% and 10% of the total processor package power in our benchmarks. The power consumed by the decoders is small compared with other components such as the L2 cache, which consumed 22% of package power in benchmark #1. We conclude that switching to a different instruction set would save only a small amount of power since the instruction decoder
cannot be eliminated completely in modern processors."
But I think the paper is a bit narrow in scope, relative to the discussion here. For any given application, you want to find a cheap part that can run that application. If you can run the application on an 8051 with a few K of ROM, then you can save a lot of money and reduce power consumption by switching to the 8051. If you need a powerful DSP to do some SDR for your cell phone radio, you're going to pick a different instruction set and pick a part that draws a lot more power.
I think the paper is taking as fixed the part functionality and considering how the encoding can be changed, but for a core running code that is not user accessible, the engineers are free to choose a core with functionality that suits their particular needs (which you can't do with the main CPU).
What's surprising to some people is that Intel has successfully scaled down their x86 cores so you can use them as embedded cores in larger ASICs, essentially. You can reuse a successful core design like the Pentium 4 and adapt it to a modern 14nm or 10nm process and you end up with something cheap and easy to use. 10 years ago that wouldn't have worked. Even recently it was much more common to use dedicated DSPs everywhere, but these days I feel like people are ditching the DSPs for cheap and ubiquitous general purpose CPUs and microcontrollers.
The Pentium 4 is not a technical success. It was an inflexible hot rod with a compromised architecture driven by marketing. That is why its architecture was abandoned.
The 486 started out on a 1000nm process. Shrinking it to a modern process and enhancing it with modern goodies like a larger cache is comparatively easy.
I'm curious where you're getting this information... from what I understand the P4 has stuck around.
Doesn't Intel pay license fees on every ARM core sold?
EDIT: Oh, and load-op-store instructions make the level below superscalar a bit harder to design, at least. And x86's strong versus ARM's weak memory ordering guarantees have effects even in OoO-land, though which one is better is a very complicated issue I'm not going to try venturing an opinion on.
I think you vastly underestimate the amount of processing power required for modern-day 4G LTE computation, even with a modern DSP.
Just an interesting find/factoid more than anything earth shattering or important.
Chances are this isn't even a true x86 CPU, just some ad-hoc smaller version of it. Wonder what cpuid returns on it :)
If you're shipping one of the highest-dollar chips in the iPhone, I'm sure cutting out ARM is one of the juiciest moves possible. That would have been a very big target for them.
That said, modern ARM architectures are also quite complex and calling them RISC is a stretch.
Now, the fact that an x86 instruction stream isn't self-synchronizing can represent a danger in theory. That is, you can craft a sequence of x86 instructions such that if you start executing from 0x0000 they're safe and friendly, but if you start executing from 0x0001 you'll see a different, equally valid, stream of instructions that might do something malicious. But that doesn't seem like a credible attack vector in this case given the embedded nature of the code.
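As a toy illustration of that property (the bytes below are ones I made up for the demo, not anything from the firmware): the same byte string disassembles to two different, equally valid streams depending on where you start. Using capstone's Python bindings:

    from capstone import Cs, CS_ARCH_X86, CS_MODE_32

    md = Cs(CS_ARCH_X86, CS_MODE_32)
    code = bytes.fromhex("B89090C390C3")   # arbitrary bytes, chosen for the demo

    for start in (0, 1):
        print(f"-- linear disassembly starting at offset {start} --")
        for insn in md.disasm(code[start:], start):
            print(f"{insn.address:#04x}: {insn.mnemonic} {insn.op_str}")

    # offset 0: mov eax, 0x90c39090 / ret
    # offset 1: nop / nop / ret / nop / ret

The imm32 of the mov swallows the bytes that, one offset later, become a separate nop/nop/ret sequence.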
Memory-aligned means that instructions can only exist on certain addresses. (It also implies a minimum instruction length.) On most RISC architectures, instructions are also a constant size, with the same alignment. For example, a MIPS instruction stream consists of 4-byte instructions aligned to every fourth byte in memory. Memory alignment and constant instruction size ensure that any given instruction stream can only be executed as itself, or part of itself. Note that this is not self-synchronization - if you removed a byte from an instruction stream, you would get a wildly different instruction stream.
(I said MIPS because ARM instruction streams can be THUMB, which permits a different instruction interpretation for the same stream, and thus defeats the security advantages of memory alignment requirements.)
Self-synchronizing means that each subdivided unit (usually byte) of a stream also indicates if it is the start of a new symbol within that stream. For example, UTF-8 is an example of a self-synchronizing byte stream. In UTF-8, encoded codepoints have variable-length representations. However, the upper bits of each byte clearly indicate not only if the byte is the start of a new codepoint, but how many bytes follow to complete the codepoint. This means that a deleted or altered byte will only delete or alter the symbol it's a part of, and not any other part of the stream.
Self-synchronization and alignment requirements address the same problem - ambiguous instruction streams - by different methods. Memory alignment prohibits you from reinterpreting the stream; while self-synchronization makes doing so less valuable.
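To make the UTF-8 point concrete, here's a small sketch of the lead-byte classification (just the bit tests, not a full decoder):

    def describe(b: int) -> str:
        """Classify a single UTF-8 byte by its upper bits."""
        if (b & 0b1000_0000) == 0:
            return "ASCII, 1-byte codepoint"
        if (b & 0b1100_0000) == 0b1000_0000:
            return "continuation byte"
        if (b & 0b1110_0000) == 0b1100_0000:
            return "start of 2-byte codepoint"
        if (b & 0b1111_0000) == 0b1110_0000:
            return "start of 3-byte codepoint"
        if (b & 0b1111_1000) == 0b1111_0000:
            return "start of 4-byte codepoint"
        return "not valid anywhere in UTF-8"

    for b in "aé€".encode("utf-8"):
        print(f"{b:#04x}: {describe(b)}")

Drop or corrupt any single byte and only the codepoint it belongs to is damaged; a decoder re-synchronizes at the next lead byte, which is exactly what a variable-length x86 stream can't promise.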
It reminds me very much of prefix-freeness. Is it fair to say that that's what we really need (without alignment)? It seems self-synchronizing is a bit stronger (no pun intended) but not strictly necessary to ensure you can't jump into the middle of an instruction?
This encoding is presumably a consequence of the instruction set containing conditional skip instructions. That is, the skip instructions don't have to parse the instruction stream for 2-word instructions but can always just skip a single instruction word and let the second word execute as a NOP.
Spectre affects almost all CPUs. It affects ARM CPUs.
But Meltdown only affects x86 CPUs. 
Maybe Intel used such an old core design in their baseband that it's not affected? Nah, that'd just be too good to be true.
Also worth noting AMD x86 CPUs were not affected. Meltdown was definitely more a "how good was your implementation" question not a "are you x86" question.
Yet, as of right now Intel has not fixed all Spectre/Meltdown issues in silicon.
Given the R&D / QA time on a specific arch, I'd bet it is still hardware flawed if they took an of the shelf in house x86 design. Atom based maybe?
So I do not feel confident about a chip that runs the baseband, and moreover one that is not auditable by the end user (unlike a main CPU would be).
I would be super interested in knowing if the reverse engineering shows traces of retpolines.
would have been much more fun if they switched to PE format though, like they did with EFI/UEFI :D
This seems like a lot of histrionics for a summary that reads:
>"Conclusions Nothing really, I just found this funny and wanted to share."
This is also the first time Intel has had their baseband modem manufactured in their own 14nm fab.
This is also the first time any Intel modem had support for CDMA / TDS-CDMA.
This is also the first time Intel has put x86 into their baseband modem. All previous modems, in the 8 years since Infineon was sold to Intel, have been ARM based.
So yes, lots of hope for this new modem. And fingers crossed Intel doesn't mess this up.
Seems very unlikely, but from a product perspective it's one (and maybe the only one) of the new features that's related to the baseband.
1) Intel somehow is doing CDMA now, that’s a major reason previous generations were split between Qualcomm and Intel
2) The major reason Intel has a seat at the table is due to Apple and Qualcomm's very public fight. Intel and Apple don’t have an entirely happy relationship, but it’s a far better relationship than Apple and Qualcomm
For example here is a "program" except it's really a meme gif off my phone. https://imgur.com/gallery/hoDKeC9
It's crystal clear that the gif from your phone is gibberish (or extremely obfuscated), whereas the code from the article is normal-looking.
From the comment right under the picture I took:
> yeah, pretty typical function prolog, what's the question ?
Except we know it is not.
I am more saying to people: be careful pushing any old binary blob through capstone without considering what it might produce. I get this at $DAYJOB where people disassemble VAX from things that are just data.
The main thing the x86 instruction set has going for it, is backwards compatibility. (Including the fact that there are a lot of highly optimized CPU designs around that instruction set.)
Also, the early Nokia 9000 Communicator had an x86 CPU; I think it was a 386. So mobile is returning to x86 rather than going to x86 for the first time.
Will show a graph of the file offset vs Shannon entropy.
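If anyone wants to reproduce that kind of plot by hand, here's a minimal sliding-window Shannon-entropy sketch (the file name and window size are made up; this isn't the tool referenced above):

    import math
    from collections import Counter

    def window_entropy(data: bytes, window: int = 256, step: int = 256):
        """Yield (offset, entropy in bits per byte) over fixed-size windows."""
        for off in range(0, max(len(data) - window + 1, 1), step):
            chunk = data[off:off + window]
            if not chunk:
                break
            counts = Counter(chunk)
            ent = -sum(c / len(chunk) * math.log2(c / len(chunk)) for c in counts.values())
            yield off, ent

    # Hypothetical input file; values near 8 bits/byte suggest compression or
    # encryption, while plain machine code usually sits noticeably lower.
    with open("baseband.bin", "rb") as f:
        for off, ent in window_entropy(f.read()):
            print(f"{off:#010x}  {ent:5.2f}")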
And as the original iPhones used Infineon chips (which Intel acquired), depending on your perspective, you could twist it and say they’ve been there since the beginning. Bit disingenuous to me, but I could see someone making that claim.
That's a very antiquated view of modern CPU design. Unless you're designing a coin-cell powered CPU, the x86 decode stage is effectively free and can be disregarded. Besides: most ARM cores (and nearly every other modern RISC CPU) do the same thing, decoding the instructions to internal micro-ops. x86 variable-length instructions vs RISC multiple instructions can be thought of as different instruction compression schemes, and which one treats your L1-I cache better depends on workload.
CPU ISAs are effectively an ABI for hardware. Except for the extremely low end, no one directly executes the ISA anymore and hasn't for years.
I suppose if anyone were designing new ISAs they'd design them with that assumption, to avoid baking-in temporary implementation details they'd regret later (see: branch delay slots).
Yep. See RISC-V's decision making process.
FYI - Apple is a hard-driving customer; they do not compromise on power, performance & cost/pricing
And does this mean it can run Doom?