String functions have so many potential pitfalls. But memcpy and friends have pitfalls too.
I was working on a RISC processor and somebody started using various std lib functions like memcpy from a Linux toolchain. I got a bug report - it crashed on certain alignments. Made sense - this processor could only load and store words at word-aligned addresses, etc.
So I wrote a test program for memcpy. Copy 0-128 bytes from a source buffer at offsets 0-128 to a destination buffer at offsets 0-128, in all combinations. It faulted on an alignment issue in code that tried to save cycles by doing register-sized loads and stores without checking alignment. That was easy! Fixed it. Ran again. Faulted again - different issue, different place.
Before I was done, I had to fix 11 alignment issues. A total fail for whoever wrote that memcpy implementation.
What was the lesson? Well, writing exhaustive tests is a good one. Not blindly trusting std intrinsic libraries is another.
But the one I took with me was, why the hell isn't there an instruction in every processor to efficiently copy from arbitrary source to arbitrary destination with maximum bus efficiency? Why was this a software issue at all? I've been facing code issues like this for decades, and it seems like it will never end.
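For anyone who wants to try the same thing, here is a rough sketch of that kind of exhaustive test in C. It is not the original test program; the names (`exhaustive_copy_test`, `bytewise_copy`, `copy_fn`), the guard bytes, and the fill pattern are all made up for illustration. On a CPU that traps on misaligned access the failure shows up as a fault rather than a counted mismatch, but the loop structure is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits: lengths and offsets run 0..LIMIT_MAX, as in the post. */
enum { LIMIT_MAX = 128, GUARD = 16 };
enum { BUF_SIZE = GUARD + LIMIT_MAX + LIMIT_MAX + GUARD };

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Trivially correct reference copy, handy for sanity-checking the harness. */
static void *bytewise_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Tries every (length, source offset, destination offset) combination up to
 * `limit` and returns how many combinations produced a wrong destination. */
static unsigned long exhaustive_copy_test(copy_fn copy, size_t limit)
{
    static unsigned char src[BUF_SIZE], dst[BUF_SIZE];
    unsigned long failures = 0;

    assert(copy != NULL && limit <= LIMIT_MAX);

    /* Non-repeating-ish pattern so shifted or dropped bytes are visible. */
    for (size_t i = 0; i < BUF_SIZE; i++)
        src[i] = (unsigned char)(i * 7u + 1u);

    for (size_t len = 0; len <= limit; len++) {
        for (size_t soff = 0; soff <= limit; soff++) {
            for (size_t doff = 0; doff <= limit; doff++) {
                size_t start = GUARD + doff;
                int ok = 1;

                memset(dst, 0xA5, sizeof dst);
                copy(dst + start, src + GUARD + soff, len);

                /* Bytes inside the copied range must match the source;
                 * everything else must still hold the 0xA5 fill. */
                for (size_t i = 0; i < BUF_SIZE; i++) {
                    unsigned char want = (i >= start && i < start + len)
                        ? src[GUARD + soff + (i - start)]
                        : 0xA5;
                    if (dst[i] != want) {
                        ok = 0;
                        break;
                    }
                }
                failures += !ok;
            }
        }
    }
    return failures;
}
```

Calling `exhaustive_copy_test(memcpy, 128)` from a `main` walks all 129 x 129 x 129 combinations; a nonzero return (or a trap partway through) means the implementation under test has a problem.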
>why the hell isn't there an instruction in every processor to efficiently copy from arbitrary source to arbitrary destination with maximum bus efficiency?
Uh, you're not a hardware designer and it shows. What if there's a page fault during the copy? Do you handle it in the CPU?
That said, have a look at the RISC-V vector instructions (not yet stable AFAIK) and ARM's SVE2: both should allow a very efficient memcpy (among other things) much more easily than with current SIMD ISAs.
Do they manage alignment? Say a source string starting at offset 3 inside a dword, copied to a destination at offset 1? That's the issue. Not just block copy of aligned register-sized memory.
Page fault is irrelevant. It already can happen in block copy instructions.
So, no, they don't have anything like an arbitrary block copy that adjusts for alignment. Not surprising; nobody does. So we struggle in software, and have libraries with 11 bugs etc.
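To make the offset-3-to-offset-1 case concrete, here is a minimal sketch of the conservative software path, assuming a 32-bit word and a CPU that faults on misaligned word access. `careful_copy` is a hypothetical name, not any library's routine, and the fixed-size `memcpy` calls only stand in for aligned word loads and stores. When source and destination are mutually misaligned it falls back to bytes, which is exactly the lost bus efficiency being complained about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical conservative copy: word accesses happen only when BOTH
 * pointers are word-aligned; otherwise it copies byte by byte. */
static void *careful_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    /* Head: byte copies until the destination is word-aligned. */
    while (n > 0 && ((uintptr_t)d % sizeof(uint32_t)) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: whole words, but only if the source ended up aligned as well.
     * This is the check the buggy implementation skipped. */
    if (((uintptr_t)s % sizeof(uint32_t)) == 0) {
        while (n >= sizeof(uint32_t)) {
            uint32_t w;
            memcpy(&w, s, sizeof w); /* stands in for an aligned word load */
            memcpy(d, &w, sizeof w); /* stands in for an aligned word store */
            s += sizeof w;
            d += sizeof w;
            n -= sizeof w;
        }
    }

    /* Tail, or the entire remainder when the two pointers are mutually
     * misaligned (e.g. source at offset 3, destination at offset 1). */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A faster version would read aligned source words and shift-merge them into aligned destination words, which is where the endianness and edge-case bugs tend to creep in.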
</rant>