Semi-hosting on ARM with Rust

pm215 · on Oct 19, 2016

Ah, semihosting; a bit unloved but very useful in some circumstances. Emulators like QEMU also support this interface, which makes it handy for random debug if you're using them as your test/development environment. The Linux kernel has a configuration option to use semihosting for its console output too.

PS: the article is not quite correct about when to use which trap instruction. Cortex-M profile CPUs should use BKPT; all other 32-bit code should use SVC (with an immediate value varying depending on whether the instruction is Thumb or ARM). 64-bit code uses a HLT insn. If you care at that level of detail you'll be looking at the documentation for the ABI anyway though...
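For readers who want the mapping at a glance, here is a small sketch of it as a lookup table. The immediates are the ones given in the ARM semihosting specification; the dictionary keys and the `trap_for` helper are names made up for this illustration.

```python
# Semihosting trap instruction by execution state.
# Keys are illustrative labels, not official terminology.
SEMIHOSTING_TRAP = {
    "A/R-profile, ARM state": "SVC 0x123456",
    "A/R-profile, Thumb state": "SVC 0xAB",
    "M-profile (Thumb only)": "BKPT 0xAB",
    "AArch64": "HLT 0xF000",
}


def trap_for(state: str) -> str:
    """Return the trap instruction for one of the labels above."""
    return SEMIHOSTING_TRAP[state]


print(trap_for("M-profile (Thumb only)"))  # BKPT 0xAB
```

Check the specification for your exact core before relying on this; it only restates the common cases mentioned above.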

_mbr · on Oct 19, 2016

I'll admit, I wasn't as accurate as I should have been. I've corrected the part, but left out the long treatise about 2 and 4 byte instructions. Hopefully no one will miss it.

brandmeyer · on Oct 20, 2016

Semihosting does more than just emulate the standard I/O streams. It also emulates the filesystem. I've used this feature to get gcov-based coverage analysis out of an embedded target. You see, gcov normally expects to be able to write its statistics out to files which are named after the object files. Of course, those don't exist on a cross target. But with semihosting, you can fake it well enough to work.

This also works for getting rich unit testing frameworks to work, too. Not only do they expect the standard streams to be available, you can also store a bunch of data-parameterized test cases for the unit under test in a file.
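To make the filesystem part concrete, here is a toy sketch of what the host side of semihosting does with file requests. The operation numbers and the `fopen`-style mode table follow the ARM semihosting specification; the `ToyHost` class, its `call` method, and the file name are hypothetical. A real debugger or emulator reads the arguments from a parameter block in target memory instead of taking Python values.

```python
import os
import tempfile

# Operation numbers from the ARM semihosting specification.
SYS_OPEN, SYS_CLOSE, SYS_WRITE = 0x01, 0x02, 0x05

# SYS_OPEN mode numbers 0..11 map onto the ISO C fopen() mode strings.
FOPEN_MODES = ["r", "rb", "r+", "r+b", "w", "wb", "w+", "w+b", "a", "ab", "a+", "a+b"]


class ToyHost:
    """Hypothetical host side that serves file requests from a target."""

    def __init__(self, root):
        self.root = root
        self.files = {}
        self.next_handle = 3  # leave 0-2 free for the standard streams

    def call(self, op, *args):
        if op == SYS_OPEN:
            name, mode = args
            # Force binary mode so this toy always deals in bytes.
            py_mode = FOPEN_MODES[mode].replace("b", "") + "b"
            handle = self.next_handle
            self.files[handle] = open(os.path.join(self.root, name), py_mode)
            self.next_handle += 1
            return handle
        if op == SYS_WRITE:
            handle, data = args
            written = self.files[handle].write(data)
            return len(data) - written  # bytes NOT written; 0 means success
        if op == SYS_CLOSE:
            (handle,) = args
            self.files.pop(handle).close()
            return 0
        raise NotImplementedError(hex(op))


with tempfile.TemporaryDirectory() as root:
    host = ToyHost(root)
    fd = host.call(SYS_OPEN, "foo.gcda", 5)  # mode 5 is "wb"
    host.call(SYS_WRITE, fd, b"coverage counters")
    host.call(SYS_CLOSE, fd)
    print(os.listdir(root))  # ['foo.gcda']
```

From the target's point of view the file simply exists after the call, which is why tools that insist on writing files can be made to work on a target with no filesystem of its own.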

AceJohnny2 · on Oct 19, 2016

> println!-style (or printf-style for more C-affine readers) debugging can be immensely useful, a quick-fix that can save a lot of time that would otherwise be spent setting breakpoints or single-stepping through a program.

Funny, the very reason I use a debugger is that it provides a faster turnaround than modifying the code, compiling, flashing, and hoping I printed the right thing...

Manishearth · on Oct 19, 2016

For me, the main advantage of printf debugging is that it gives a lot of output at once. With a debugger, you have to stop at each breakpoint, print something, and go on. In case the breakpoint is in a hot path, this takes forever. With printf I can get a nice text file to skim through.

This doesn't mean I use printf debugging often, just that it makes sense in some cases and I use it then :)

Jtsummers · on Oct 19, 2016

With gdb, at least, you can set up breakpoints like this:

  break file.c:200
  commands
  silent
  printf "x is %d\n",x
  cont
  end

(NB: silent prevents gdb from announcing that it's stopping at the breakpoint, reducing the output noise.)

Now with gdb running the program, every time we hit that line we can get this output. With logging (set logging file <filename>, set logging on) you can get the output from the run's execution stored to a file.

Conditionals are also possible, so we only print out the results of every 10th iteration through a loop, for instance.

  br file.c:200 if (i % 10) == 9
  commands
  silent
  printf "Iteration: %d, X: %x\n",i,x
  cont
  end

(NB: Not at a computer with gdb, apologies if I miswrote something in the examples.)

monocasa · on Oct 19, 2016

Over JTAG, you're looking at ~200ms for each break which makes it untenable for a lot of embedded use-cases.

Jtsummers · on Oct 19, 2016

That's fair, I'm not using that for my work, however. Once you start getting into different extremes you have to choose the appropriate tool.

EDIT: We also have a system where we were running on embedded linux, but due to constraints simply couldn't get gdb involved. printf-style, implemented by way of a logging function where we could specify logging levels, was the effective solution for us.

Manishearth · on Oct 19, 2016

This is nice! Thanks!

Jtsummers · on Oct 19, 2016

Check out the pages in 'info gdb' on your system. Lots of good info (no pun intended) in there. Completely changed how I used it, mostly because I'd forgotten a lot of that stuff from the one undergrad course that integrated GDB into the course.

It has been one of the true game changers in my current office's work. We're doing a lot of stuff (new-ish here) in embedded linux (versus self-hosted executables). It's a pain in the ass to load different versions of the software: because of the way the rest of the system works, we can't load it on-the-fly, since we would miss various timeouts that we can't control and the whole system would go into a failed state; and we can't load and save it, because the OS is stored in nonvolatile memory that we can't directly alter without going into u-boot. For debugging this, we have the system load and, before gdb executes, download a specifically named file over the network containing the gdb commands for whatever we want to obtain. So you want to know if function foo ever gets called with a parameter of 4? Put that into the command file and reboot, run the tests and see the output in the serial console, then download the log before rebooting again.

rkangel · on Oct 20, 2016

It's not just slow: if you're running any kind of control application (like running a radio), the whole system crashes to a halt when you hit a breakpoint (because you're no longer servicing whatever you're meant to service). This makes the subsequent stepping meaningless.

In this environment, a stream of debug output (printfs in the basic case) is unmatchable. Also, as you build them up throughout your project they become a great diagnostic tool allowing you to trace program behaviour and diagnose bugs just by looking at the output.

freehunter · on Oct 19, 2016

I use it almost exclusively to just print out what every variable is reading at any given moment to find the one spot where it's not within expectations.

I'm not a very good programmer...

simcop2387 · on Oct 19, 2016

Yea printf debugging can be a really nice way to do selective tracing like that. Sometimes that's more important than a breakpoint.

duaneb · on Oct 20, 2016

Stepping through code is only useful when you know where the code diverged from your expectations. If you proactively log well enough, you sometimes understand the state of the program well enough to bypass the debugger completely.

If hope is involved, you could have almost certainly saved the same time by thinking more carefully.

lisivka · on Oct 20, 2016

I totally agree. It's easier to invest in code that catches and explains unusual situations than to waste time in a debugger. For example, instead of setting a breakpoint at some location, it is easier to write an assert with all the necessary debug information (e.g. values of variables, paths to files, names of units, etc.). The assert will work 24x7 instead of me. Moreover, the error message doubles as a built-in test and as documentation. Asserts will also work in production, where debugging is not possible at all.
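A minimal sketch of what such an assert might look like, in Python for brevity. The function `load_unit`, its parameters, and the path are invented for this example.

```python
def load_unit(path: str, unit: str, size: int, limit: int) -> None:
    # The message carries the context a debugger session would otherwise
    # have to dig out by hand: which unit, which file, which values.
    assert 0 <= size <= limit, (
        f"unit {unit!r} from {path!r}: size {size} outside 0..{limit}"
    )


try:
    load_unit("/etc/units/net.conf", "net", size=4096, limit=1024)
except AssertionError as err:
    print(err)
```

One caveat specific to Python: `assert` statements are stripped when the interpreter runs with `-O`, so a check that must survive in production would need an explicit `if` and `raise` instead.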

optimuspaul · on Oct 19, 2016

I've found that using a debugger will often hide certain race conditions. I'm biased towards logging rather than a debugger partially for this reason, but also I'm lazy.

radiospiel · on Oct 19, 2016

Depending on the configuration logging can also hide race conditions, since the OS usually synchronizes access to file handles and such.

taneq · on Oct 20, 2016

That depends hugely on what you're doing. Often yes, stepping through is quicker and easier. Sometimes, though, you're dealing with a realtime process which is interacting with outside hardware and there's timeouts going off on all sides, and it's easier to just log (or otherwise get data out without halting execution - even just ye good old blinky LED or storage scope on a spare pin.)

adamnemecek · on Oct 19, 2016

It depends on what debugger you use I think. Setting breakpoints in gdb is painful, it's easy in IDEs.

qznc · on Oct 19, 2016

I would need a multi-process debugger. The workaround with gdb is to attach one gdb to each process. That is tedious to set up and has little advantage over logging.

kukx · on Oct 19, 2016

Please consider using a thicker font. This one is not very readable on my device (Windows 10 + Chrome + 17" 100PPI display). I can read it, but it's not a pleasant experience. [edited]

_mbr · on Oct 19, 2016

It's Lato Light (Lato @ font-weight: 300) at full #000, it's not a particularly obscure choice I think. I've checked the rendering on anything-but-retina monitors (27" at 1920x1080) but it looked fine to me.

Unfortunately, font rendering is still a bit of a crapshoot, it differs from OS to OS and even varies wildly between Linux distros (which may or may not use potentially non-free freetype functionality).

In the end, I hope it's manageable. We're open to redesign offers though =).

lnanek2 · on Oct 19, 2016

That font is really thin and weird looking here on OSX too. I can read it, just not sure I want to.

_mbr · on Oct 19, 2016

Is it also unreadable at https://fonts.google.com/specimen/Lato ? (Try "Thin" at 18 px size)?

usea · on Oct 19, 2016

I'm not parent, but experiencing the same problem. Yes, it's also very thin at the google font site, on 18 px Thin. Here is a screenshot: http://i.imgur.com/bUe7gNs.png The arrow points to the "Thin" line.

Windows 10, Firefox 49.0.1

FWIW, I think it's a pretty common problem with many web fonts on certain platforms (windows?). I see it all the time.

Also, thanks for the post!

_mbr · on Oct 19, 2016

FWIW, I've changed the font-weight to regular. It's the same size on my non-Retina displays, but increased to regular thickness on the high resolution ones.

Hopefully, it's readable by everyone now.

freehunter · on Oct 19, 2016

MacOS Sierra on Safari, looks fine to me. Does not match the screenshot provided in this thread.

comex · on Oct 19, 2016

    bkpt_n = int(raw, 16 if raw.startswith('0x') else 10)

Tip: you can accomplish something similar with 'int(raw, 0)'.
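A quick sketch of how base 0 behaves, using made-up input strings. Python infers the base from the literal prefix, the same way it does for integer literals in source code.

```python
# All four spellings of the same number parse to 171.
for raw in ("171", "0xab", "0o253", "0b10101011"):
    print(raw, "->", int(raw, 0))

# One difference from the original expression: with base 0, a decimal
# string with a leading zero is rejected instead of being read as base 10.
try:
    int("010", 0)
except ValueError as err:
    print("rejected:", err)
```

That last case is why the two are only similar: `int("010", 10)` returns 10, while `int("010", 0)` raises `ValueError`.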

_mbr · on Oct 19, 2016

That's a neat feature I'll remember. Fixed as well.

xxr · on Oct 19, 2016

Haven't read the article yet, but the coauthor Philipp Oppermann has a really enjoyable guide I've been following on building an OS in Rust: http://os.phil-opp.com/

hornetblack · on Oct 20, 2016

> The "volatile" option indicates that the code has side-effects and should not be removed by optimizations.

I think volatile also means don't change the order with respect to other volatile operations.