In general you can't really do BGA with OSH Park without fudging their specs significantly. Some people have managed it, but you tend to end up with peeling traces. Also, don't forget shipping on all the bits and pieces.
For cross-platform interoperability, an API that uses exact-size types helps remove any ambiguity. Using size_t might be fine for intra-process usage, but as soon as we are dealing with data across platforms, exact-size type definitions are a must.
I see it the other way around. How many bits you need to address something in memory depends on the platform. Thus `size_t` is the only cross-platform type you can use. A fixed-size integer type is going to work on some platforms, but not all.
Sounds like you are mixing up your data's in-memory representation with its storage/transmission representation. This is risky business.
If you have no requirement that says otherwise, you should have explicit marshalling and demarshalling steps that transform your live data objects into opaque BLOBs and back. It would be highly desirable for your BLOBs to have a header containing metadata used exclusively for marshalling purposes. At the very least, the size of the payload, an object type id and a format version id will save you lots of trouble.
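A minimal sketch of such a header, in C. The layout, the field widths and every name here (`blob_header`, `blob_header_marshal`, `blob_header_demarshal`) are made up for illustration and not taken from any real protocol. The header is written byte by byte in big-endian order, so the result does not depend on the host's endianness or struct padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout (an assumption for this sketch):
 *   bytes 0-3  payload size      (big-endian)
 *   bytes 4-5  object type id    (big-endian)
 *   bytes 6-7  format version id (big-endian) */
enum { BLOB_HEADER_SIZE = 8 };

struct blob_header {
    uint32_t payload_size;
    uint16_t type_id;
    uint16_t format_version;
};

/* Serialize the header one byte at a time; never memcpy the struct. */
static void blob_header_marshal(const struct blob_header *h,
                                uint8_t out[BLOB_HEADER_SIZE])
{
    assert(h != NULL && out != NULL);
    out[0] = (uint8_t)(h->payload_size >> 24);
    out[1] = (uint8_t)(h->payload_size >> 16);
    out[2] = (uint8_t)(h->payload_size >> 8);
    out[3] = (uint8_t)(h->payload_size);
    out[4] = (uint8_t)(h->type_id >> 8);
    out[5] = (uint8_t)(h->type_id);
    out[6] = (uint8_t)(h->format_version >> 8);
    out[7] = (uint8_t)(h->format_version);
}

/* Rebuild the header from its 8-byte wire form. */
static struct blob_header blob_header_demarshal(const uint8_t in[BLOB_HEADER_SIZE])
{
    struct blob_header h;
    assert(in != NULL);
    h.payload_size = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                     ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    h.type_id        = (uint16_t)(((unsigned)in[4] << 8) | in[5]);
    h.format_version = (uint16_t)(((unsigned)in[6] << 8) | in[7]);
    return h;
}
```

The receiver can read these eight bytes first, check the version and type id, and only then decide how to interpret the payload that follows.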
Now, what happens if you need high performance and are willing to accept more code complexity in exchange for faster execution? You can just copy your native object's bytes into the BLOB payload, as long as you correctly identify the source platform's relevant characteristics in the header. Then, when the target host does the demarshalling step, it can decide whether the native format is compatible with its own platform and, if so, just copy the payload into a zeroed buffer of the correct size. If that is not the case, it will have to perform an extra deferred marshalling step to put the payload in "canonical" format prior to demarshalling proper.
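Roughly, the fast path could look like the C sketch below. The `platform_tag` struct and the function names are hypothetical, and the three characteristics it records (byte order, `sizeof(size_t)`, `sizeof(long)`) are only an assumption about what "relevant" means; a real check would probably also have to cover alignment and struct padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical platform descriptor carried in the BLOB header. */
struct platform_tag {
    uint8_t byte_order;   /* 1 = little-endian, 2 = big-endian */
    uint8_t size_t_bytes; /* sizeof(size_t) on the sender */
    uint8_t long_bytes;   /* sizeof(long) on the sender */
};

/* Describe the machine this code is running on. */
static struct platform_tag platform_tag_local(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;
    struct platform_tag t;
    memcpy(&first_byte, &probe, 1);
    t.byte_order   = (uint8_t)(first_byte == 1 ? 1 : 2);
    t.size_t_bytes = (uint8_t)sizeof(size_t);
    t.long_bytes   = (uint8_t)sizeof(long);
    return t;
}

static bool platform_tag_equal(struct platform_tag a, struct platform_tag b)
{
    return a.byte_order == b.byte_order &&
           a.size_t_bytes == b.size_t_bytes &&
           a.long_bytes == b.long_bytes;
}

/* Fast path: if the sender's layout matches ours, zero dst and copy the
 * payload in, returning true. Otherwise leave dst untouched and return
 * false, so the caller can convert to the canonical format first. */
static bool demarshal_native(struct platform_tag sender,
                             const void *payload, size_t payload_size,
                             void *dst, size_t dst_size)
{
    assert(payload != NULL && dst != NULL);
    if (!platform_tag_equal(sender, platform_tag_local()) ||
        payload_size > dst_size)
        return false;
    memset(dst, 0, dst_size);
    memcpy(dst, payload, payload_size);
    return true;
}
```

A `false` return is the signal to fall back to the slower canonical-format route.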
You can even make the behavior configurable, so that customers running a heterogeneous environment do not suffer a performance hit for the sake of the customers in homogeneous environments.
Of course the data in storage or over the wire needs to be marshalled and unmarshalled (whether by explicitly standardizing on a particular wire format, or with header-based hacks, or whatnot). That's not the point.
The point is that a lot of the time, the two machines on either end of the wire need to agree on the sizes of various fields you're sending (say, in protocol headers). And then you want to work with that data internally in the code on either side. You'd better be absolutely sure how many bits you have in each type that you're allocating for these purposes.
And going even beyond that very common use case, a lot of code reads cleaner and is easier to debug when you know the exact sizes of the types you're using. It's not something reserved for just network programming.
Sorry, I fail to see the point in your second paragraph. Of course at the business logic level you need to allocate variables that can hold every possible value in the valid range, but as long as this is the case, why does it matter that you use types that have the same byte size on every possible platform?
In your third paragraph, I agree on the debuggability front (if you are actually reading memory dumps; otherwise, why should it matter?). As for the code reading more clearly, I guess this is more a matter of taste.
It matters because of code readability, debuggability and all sorts of code hygiene reasons. If I'm using size_t for a field in my protocol on a 32-bit platform on one end and a 64-bit platform on the other, which size wins over the wire? Can that question be answered while you are in the middle of debugging, trying to track down a memory stomping error?
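One way to make that question moot, sketched in C: pick a fixed width for the field on the wire and convert at the boundary. The 8-byte little-endian encoding and the names `wire_put_len` and `wire_get_len` are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: lengths always travel as 8 bytes,
 * little-endian, regardless of the width of size_t on either host. */
enum { WIRE_LEN_BYTES = 8 };

/* Widen the host's size_t to 64 bits and emit it byte by byte. */
static void wire_put_len(size_t len, uint8_t out[WIRE_LEN_BYTES])
{
    uint64_t v = (uint64_t)len;
    assert(out != NULL);
    for (int i = 0; i < WIRE_LEN_BYTES; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* Read the 64-bit wire value back. Returns false if it does not fit in
 * this host's size_t (possible when size_t is narrower than 64 bits). */
static bool wire_get_len(const uint8_t in[WIRE_LEN_BYTES], size_t *len)
{
    uint64_t v = 0;
    assert(in != NULL && len != NULL);
    for (int i = 0; i < WIRE_LEN_BYTES; i++)
        v |= (uint64_t)in[i] << (8 * i);
    if (v > SIZE_MAX)
        return false;
    *len = (size_t)v;
    return true;
}
```

With this, neither platform's size_t "wins": the wire width is fixed, and a receiver with a narrower size_t gets an explicit failure instead of a silently truncated length.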
There have been some minor additions, he said: the J2 adds four new instructions. One for atomic operations, one to work around the barrel shifter, "which did not work the way the compiler wanted it to [...]
This is so intriguing! Does anyone know what was wrong with the original barrel shifter design? I tried reading up on it but failed to find much reference material. I followed the link to the J-core community site to read the code, but it wasn't immediately browsable, just available for download.
I assume there were compilers for SuperH back in the day; didn't they use the shifter? Why not fix the compiler and teach it to use the existing instruction, rather than adding an instruction just for this? How wrong can a shifter be, really? The questions just heap up.
Compilers did use the shifter. I don't know if this is exactly what he was referring to, but one oddity of the SH4's dynamic shift instruction is that it only shifts to the left (there are also a limited number of instructions that shift by a small constant amount: 1, 2, 8 or 16). To shift to the right, you have to first negate the shift amount, then perform a left shift. So if you did a right shift by a non-constant amount, you would always see a negation of the shift amount before the shift. My guess as to why it was implemented like this is that, since the SH4 had a fixed-length, 2-byte instruction set, running out of possible instructions for future expansion was a real hazard, and not encoding both directions was done to save encoding space.
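To make the negate-then-shift pattern concrete, here is a simplified C model. It is only a sketch of the behavior described above, not a transcription of the SH4 manual: the function names are made up, only logical shifts are modeled, and shift amounts outside [-31, 31] are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a signed dynamic logical shift: a non-negative
 * amount shifts left, a negative amount shifts right by its magnitude.
 * Only amounts in [-31, 31] are modeled here. */
static uint32_t dynamic_shift(uint32_t value, int32_t amount)
{
    assert(amount >= -31 && amount <= 31);
    if (amount >= 0)
        return value << amount;
    return value >> -amount;
}

/* What a compiler effectively has to emit for `value >> n` when n is
 * not a constant: negate the amount, then issue the dynamic shift. */
static uint32_t right_shift_via_negate(uint32_t value, uint32_t n)
{
    int32_t negated;
    assert(n <= 31);
    negated = -(int32_t)n;  /* the extra negation instruction */
    return dynamic_shift(value, negated);
}
```

The negation in `right_shift_via_negate` is the extra step a dedicated right-shift instruction would make unnecessary.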
On the original SH4 implementation, under certain conditions, there had to be one cycle between when a shift amount was generated and when it was used; otherwise there would be a one-cycle CPU stall. A real right shift would avoid the need to schedule around this stall. This isn't necessarily something that needs an extra instruction to fix, since the implementation could be designed to not need the stall, but it might be difficult to work around. I don't do circuit design, but dynamic shift instructions typically look at as few bits of the shift amount as possible, to simplify and speed up the design of the shifter. The reason for the delay in the original SH4 is probably that it analyzes and tags each register with information about the correct shift direction and amount, and certain units won't have this information ready for the shifter in time, hence the stall if the shift is too close to the shift amount generation. (I've read that certain CPU implementations have done similar work in tagging whether a register is zero or not, in order to help keep branch-on-zero/not-zero instructions quick.) If the instruction being talked about is a dedicated right shift, it could be defined in a way that doesn't need a negation and extra tagging, and it would be much more compiler friendly, and faster.