IMHO when debugging software needs to be automated to the point that a library needs to be written (and itself debugged), there's an underlying problem which can't be solved with adding more layers of complexity - as that will only introduce more bugs. When these bugs are in the software you're using to debug, things can quickly take a turn for the worse.
Is that code really taking the user's input (a string), getting object addresses and offsets (numbers) from that, then converting those into strings to build a command string, which then gets parsed back into numbers for the debugger to ultimately use to create a watchpoint? I think that is itself a good example of how the "more code, more bugs" principle can apply: all this superfluous conversion code has introduced a bug.
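A contrived Python sketch (not the actual Chisel/LLDB code) of how that round trip invites bugs: one layer formats a number into a command string in decimal, while another layer parses it back assuming hex.

```python
# Hypothetical watchpoint helper: numbers -> string -> numbers again.
# All names here are invented for illustration.

def build_watch_command(base_addr, offset):
    # One layer formats the computed address into a command string (decimal).
    return "watchpoint set expression -- %d" % (base_addr + offset)

def parse_address(token):
    # Another layer parses it back -- and assumes hex, as debugger
    # front ends often do for addresses. The round trip silently breaks.
    return int(token, 16)

base, off = 0x1000, 8
cmd = build_watch_command(base, off)          # ends with "4104" (decimal)
addr_token = cmd.rsplit(" ", 1)[-1]
parsed = parse_address(addr_token)            # 16644, not 4104: the bug
```

None of this conversion machinery would exist if the address stayed a number end to end.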
Here's a good article about that, although it doesn't mention the situation where the bugs you introduce end up being in the software you need to use to remove bugs...
I've hacked plenty of GDB. It's just a program like anything else. Why wouldn't you want to debug it? There's nothing mystical or special about it. When you combine the features you describe (which are, in fact, quite non-trivial) with code to look up symbols, deal with remote gdbserver instances, and paper over extreme differences between operating systems and executable formats, you end up with a very complex program that has the same maintenance needs as any other.
Nobody is saying there's anything mystical or special about debugging a debugger; the point is that it takes a lot of energy and time. Even launching GDB is a pain. And there's a reason so much live-debugging research uses Smalltalk: when you debug a debugger you want as much reflection as possible, and Smalltalk has reified almost all of its internal machinery (see, for example, the Moldable Debugger).
What is the underlying problem you're referring to?
It is admittedly a rabbit hole when you're writing code to debug code. But isn't that the whole spirit of writing tools to make your life easier and more efficient? We pile new abstraction layers atop more and more abstraction layers, and then we get computing in its impressive form as it exists today.
GDB and LLDB and every other debugger are huge software libraries for debugging, and yeah, they might introduce bugs as well, but does that mean they shouldn't be used, or that they're not useful? I find it quite useful that LLDB has a scripting interface to automate debug sequences I find myself doing over and over. And since there is a scripting interface, we can find open-source libraries for common debug tasks, so when there is a bug, the bug is shallow thanks to many eyes.
I.e., given C++ or Objective-C, what the compiler describes to the debugger requires the debugger to know and do a lot in order to get actual values out.
For C++, it's actually pretty good, except that function calls etc. require understanding the ABI. I.e., the debug info I get tells me "if you want the value of this variable, evaluate this expression". I know the layouts and how to interpret them. It's rare that the expression is too complicated (it may require piecing together registers and memory, but it's just a state machine).
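The "it's just a state machine" point can be made concrete with a toy evaluator for a tiny subset of DWARF location expressions. Opcode names mirror DWARF's DW_OP_* constants, but the encoding here is simplified (symbolic tuples rather than real ULEB128-encoded bytes), and the register/memory values are invented.

```python
def eval_location(expr, registers, read_memory):
    """Evaluate a (simplified) DWARF location expression: a stack machine."""
    stack = []
    for op in expr:
        name = op[0]
        if name == "DW_OP_breg":            # push register value + signed offset
            stack.append(registers[op[1]] + op[2])
        elif name == "DW_OP_deref":         # replace top of stack with *top
            stack.append(read_memory(stack.pop()))
        elif name == "DW_OP_plus_uconst":   # add an unsigned constant to top
            stack.append(stack.pop() + op[1])
        else:
            raise NotImplementedError(name)
    return stack[-1]

# "The variable lives at *(rbp - 16) + 8": pieces together a register and
# memory, exactly the kind of recipe debug info hands the debugger.
regs = {"rbp": 0x7fff0020}
memory = {0x7fff0010: 0x2000}   # fake address space
addr = eval_location(
    [("DW_OP_breg", "rbp", -16), ("DW_OP_deref",), ("DW_OP_plus_uconst", 8)],
    regs, memory.__getitem__)   # -> 0x2008
```

The real format adds many more opcodes, but the evaluation model is exactly this.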
For Objective-C, even things like "instance variables" require that the debugger understand a lot.
Java has a fairly reasonable agent, etc.
Part of this is that the type systems of the debug-info formats (DWARF, etc.) are very simple, so even though they theoretically support things like function calls, that support is rarely used to provide the necessary functionality, and the debugger is left having to do it itself.
* module-differentiated references (since foo.dll!globalThing can be different from bar.dll!globalThing)
* scope-differential references (each compilation unit can have its own statics)
* CPU registers (which make perfect sense as variables in interactive debugging)
* convenience variables
* preprocessor macros (which we record in DWARF these days)
* line number references
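The module-qualified case above can be sketched as a lookup keyed by (module, name), since a bare name is only unique per module. All names and addresses here are invented for illustration.

```python
# Symbol tables keyed by (module, name): foo.dll!globalThing and
# bar.dll!globalThing are genuinely different variables.
symbols = {
    ("foo.dll", "globalThing"): 0x10001000,
    ("bar.dll", "globalThing"): 0x20003000,
}

def resolve(ref, current_module="foo.dll"):
    if "!" in ref:                       # explicit: bar.dll!globalThing
        module, name = ref.split("!", 1)
    else:                                # bare name: current module wins
        module, name = current_module, ref
    return symbols[(module, name)]
```

Scope-differentiated statics work the same way, with the compilation unit standing in for the module.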
All true (I maintained C++ support in GDB for years, so I'm sadly aware of most of these issues), but parsing is a user-interface issue (i.e., "what is the user asking me about?"), rather than "how do I actually access the value the user asked me about?". You assume, strongly, that the user wants to use the same expressions that exist in their program. Let's assume this is true for a second: good solutions for the "what is the user asking about" part already exist (libclang, etc.) for most languages; for a lot of languages, no good solutions exist to abstract the "how do I access the value of that in this implementation" part.
(This is an "in practice problem". In theory, you could pretty easily extend DWARF to tell me how to call functions in C++, for example).
" Using some kind of "agent" embedded in debugged programs as a necessary part of debugging is unacceptable, since you're frequently debugging core files and minidumps and you can't exactly put a question to a corpse."
First, I'm going to challenge this. It may be true in what you do.
However, at least in the development environment in which I work, in C++, debuggers are a tool of last resort (I literally have per-line command logs of what developers where I work do with the debugger).
The number of times they are run on core files is < 5%.
This is >25k developers.
Given that the vast majority are not debugging core files, ISTM it makes more sense to have an architecture targeted at serving that 95% super well, and then handle the remaining 5% of cases differently
(I expect, when you are that screwed, that you may need a different set of tools to be effective anyway, since core files are post-mortem debugging).
Second, you make the strange assumption an agent can't read or work with core files, and needs a live process?
"The debugger needs to understand how to do all of this itself."
You assert this rather than show this.
What stops an agent from having an interface to read from memory (most in fact, do), and the callback lets the debugger give it memory from the core dump or the host?
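A minimal sketch of that point: if the agent reads target memory only through a callback, the host debugger can back the callback with a live process or with a core file, and the agent can't tell the difference. Everything here is illustrative, not a real gdbserver/agent API.

```python
class Agent:
    """Agent logic that is indifferent to where the bytes come from."""

    def __init__(self, read_memory):
        self.read_memory = read_memory    # callback: (addr, size) -> bytes

    def read_u32(self, addr):
        return int.from_bytes(self.read_memory(addr, 4), "little")

# A "corpse" backend: memory served out of a core-dump image in the host.
core_image = {0x1000: (0xDEADBEEF).to_bytes(4, "little")}

def read_from_core(addr, size):
    return core_image[addr][:size]

agent = Agent(read_from_core)
value = agent.read_u32(0x1000)   # works without any live process
```

Swapping in a `ptrace`/live-process backend would be a one-line change at construction time; the agent code itself is untouched.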
This is, in fact, what already happens in remote debugging of core dumps...
Maybe I wasn't clear --- I think we agree on this point. Debugger parsing is different from (and in some ways harder than) regular compiler parsing because users want to use familiar syntax that differs from regular program code. I can write print/x $pc+4 --- no C compiler interprets "$pc" to mean the program counter.
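One cheap way a front end can handle those debugger-only names is to substitute them before evaluating the remaining arithmetic. A minimal sketch, with invented register values (a real debugger parses properly rather than leaning on `eval`):

```python
import re

# Fake current register state; in a real debugger this comes from the target.
REGISTERS = {"pc": 0x400500, "sp": 0x7FFFE000}

def eval_debugger_expr(expr):
    # Replace $pc, $sp, ... with their current values, then evaluate
    # the leftover arithmetic with builtins disabled.
    substituted = re.sub(r"\$(\w+)", lambda m: str(REGISTERS[m.group(1)]), expr)
    return eval(substituted, {"__builtins__": {}})

result = eval_debugger_expr("$pc + 4")   # -> 0x400504
```

This only covers the "what is the user asking about" half; mapping the result back onto target state is the hard half discussed above.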
While it's true that it's a UI issue, this classification doesn't make the problem any easier.
> you could pretty easily extend DWARF to tell me how to call functions in C++
What extensions would you add? We already have stack-layout information, and the debugger implicitly knows the platform ABI.
> C++, debuggers are a tool of last resort
I've seen this phenomenon too, and it's upsetting: debuggers can be much more efficient. I've put a lot of work into making end-to-end debugging seamless, but I still see developers using traditional in-code tracing.
I want to try making Mozilla's rr available in an equally easy-to-consume package and see whether the ability to debug in reverse begins to sway people.
> Second, you make the strange assumption an agent can't read or work with core files, and needs a live process?
I think we mean different things by "agent". I was talking about a remote stub that lives in the process to be debugged. If you instead move that logic to a pluggable component that the debugger merely hosts and uses as a general abstraction over targets of various sorts, debugging core files is feasible. (But in that case, how is it conceptually different from struct target_ops, which we already have?)
But it does not know the C++ ABI.
Here is the minimum amount of random crap GDB currently has to understand, on its own, about the GNU v3 C++ ABI:
(There's more, it's just not all in this file :P)
In an ideal world, the debugger should need to know none of this. It should be part of the debug info.
(The Delphi IDE integrates its compiler with the debugger in this fashion.)
But the pull request he created (https://github.com/facebook/chisel/pull/117) didn't fix that; instead it just replaced that function call with manually parsing the literal in the one specific place he was having problems with.
Did I miss something there? That seems like a really weird "solution". Why not just fix the original function?
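I haven't read the function in question, but as a sketch of the "fix the original function" approach (Chisel is written in Python): a single shared literal parser, leaning on `int(s, 0)`'s C-style base inference, so callers don't each grow their own ad-hoc parsing. The function name is invented.

```python
def parse_literal(s):
    """Parse a numeric literal the way C source would read it.

    int(s, 0) infers the base from the prefix: "42" -> decimal,
    "0x2a" -> hex, "0o52" -> octal. Fixing this once here fixes
    every call site at the same time.
    """
    return int(s.strip(), 0)
```

Patching the one call site instead leaves the same bug waiting in every other caller of the broken function.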