I want to preface this by saying I have only added little bits and bobbles, and a huge chunk of the heavy lifting was done by aiju. Most of my answer is about code written by them.
Both gb and gba make an attempt to preserve the cycle timing for each instruction and memory access (for example, some of the later portions of the gba cartridge ROM have increased load times). For gb I have thrown a couple of those "acid tests" at it and it usually does pretty well, but I have yet to really sit down and test it extensively. When I implemented the serial cable link over TCP for gb I did find some desync in stuff like Pokemon battles. That smelled like a timing accuracy issue, but I have yet to really figure it out. My current theory is that the hand-off between the CPU and the PPU is not quite accurate. It's cute code that uses some setjmp/longjmp magic, but perhaps not true to hardware.
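To make the hand-off idea a bit more concrete, here is a rough sketch of how a CPU/PPU hand-off can lose precision. This is not how gb does it (gb uses the setjmp/longjmp approach mentioned above); it is a plain lockstep loop I am assuming purely for illustration, and the names `Ppu`, `ppu_catch_up`, `cpu_step` and `run` are made up. The only hardware numbers in it are the Game Boy's 456 dots per scanline and 154 scanlines per frame.

```c
#include <assert.h>
#include <stdint.h>

/* Game Boy PPU timing: 456 dots per scanline, 154 scanlines per frame. */
enum { DOTS_PER_LINE = 456, LINES_PER_FRAME = 154 };

/* Hypothetical PPU state, reduced to its position in the frame. */
typedef struct {
	uint32_t dot;    /* dot within the current scanline */
	uint32_t line;   /* current scanline */
	uint64_t frames; /* completed frames */
} Ppu;

/* Advance the PPU by the cycles the CPU just consumed.
 * The PPU only "sees" time at instruction boundaries, so anything that
 * happens mid-instruction is observed late. */
static void
ppu_catch_up(Ppu *p, uint32_t cycles)
{
	assert(p->dot < DOTS_PER_LINE && p->line < LINES_PER_FRAME);
	p->dot += cycles;
	while(p->dot >= DOTS_PER_LINE){
		p->dot -= DOTS_PER_LINE;
		if(++p->line == LINES_PER_FRAME){
			p->line = 0;
			p->frames++;
		}
	}
}

/* Stand-in for a real CPU step: returns a made-up cycle cost
 * cycling through 4, 8, 12, 16 instead of decoding an opcode. */
static uint32_t
cpu_step(uint32_t i)
{
	static const uint32_t cost[] = {4, 8, 12, 16};
	return cost[i % 4];
}

/* Run n fake instructions, handing off to the PPU after each one.
 * Returns the total number of cycles consumed. */
static uint64_t
run(Ppu *p, uint32_t n)
{
	uint64_t total = 0;
	uint32_t i, c;

	for(i = 0; i < n; i++){
		c = cpu_step(i);
		ppu_catch_up(p, c);
		total += c;
	}
	return total;
}
```

In a loop like this, a 16-cycle instruction that starts at dot 450 only hands control over once it is done, so the PPU jumps straight to dot 10 of the next line. If the real hardware would have reacted somewhere in the middle, that is the kind of small drift I suspect could add up over a link cable session.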
To me the selling point of these emulators is not "these are the most accurate to hardware around" but rather their general simplicity of implementation. They are really quite compact for what they do; almost all are just a couple thousand lines of C.