This is very cool. I am a big believer in tools analyzing a sequence of heap snapshots to automate the time-consuming parts of manual debugging.
I am working on doing something similar in C++, where there is an additional obstacle of recreating the objects from a memory snapshot without runtime support.
Thanks! JavaScript snapshots make this feasible because the runtime exposes a complete object graph — types, edges, arrays, strings, everything.
Doing similar work in C++ is on a totally different level: raw memory, no type info, pointer chasing, layout inference... Very curious to hear how you’re
approaching it.
Debugging information is more precise than line numbers: it usually conveys both line and column in a source file.
Some debuggers make use of it when displaying the current program state, but the major debuggers do not allow you to step into a specific sub-call on a line (e.g. skip function arguments and go straight to the outermost function call).
This is purely a UI issue, they have enough information. I believe the nnd debugger has implemented selecting the call to step into.
Addr2line could be amended. I am working on my own debugger and I keep re-implementing existing command line tools as part of my testing strategy. A finer-grained addr2line sounds like a good exercise.
Our exact context here is not just about column numbers, but also about backslash line continuations joined by the C preprocessor. That makes the emitted #line directives refer to columns within a (large) "virtual line assembled by the tooling", not an "actual source" coordinate.
So, a column number would not be very meaningful to a programmer (relative to some ';' or '{}' expression-level label leveraging the language's internal syntax/bracketing, which would definitely still take a bit of mucking about). As per my Lisp mention, it is really a >1-dimensional idea, and there are various ways to flatten/marshal that parse tree. "next/over" and "step/into" are enough to build up that 2D navigation "incrementally/dynamically/interactively", but they are harder to work with "cumulatively" and with grammars more complex than Lisp's. Maybe most concretely, how "subexpression numbers" (in addr2x or other senses) are enumerated might still be a thing programmers need to "learn" from their "debugger UI".
Another option might be to "reverse preprocess it" or maintain forward-meta-data to go from the "virtual line column number" back to the "true source (line,column)".
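To make the "forward metadata" idea concrete, here is a minimal sketch in Python. It assumes the only transformation is backslash-newline splicing (no macro expansion, trigraphs, or comments), and it numbers virtual lines consecutively, which a real preprocessor does not necessarily do. The names `splice_continuations` and `true_position` are made up for illustration and do not come from any existing tool.

```python
def splice_continuations(source: str):
    """Join backslash-newline continued lines into "virtual lines".

    Returns a list of (virtual_text, origins) pairs, where origins[i] is the
    1-based (line, column) in the original source of virtual_text[i].
    """
    result = []
    text, origins = [], []
    for line_no, physical in enumerate(source.split("\n"), start=1):
        continued = physical.endswith("\\")
        # Drop the trailing backslash; the newline is already gone.
        body = physical[:-1] if continued else physical
        for col, ch in enumerate(body, start=1):
            text.append(ch)
            origins.append((line_no, col))
        if not continued:
            result.append(("".join(text), origins))
            text, origins = [], []
    if text:  # the source ended right after a continuation
        result.append(("".join(text), origins))
    return result


def true_position(spliced, virtual_line: int, virtual_col: int):
    """Map a 1-based (virtual line, column) back to the source (line, column)."""
    _, origins = spliced[virtual_line - 1]
    return origins[virtual_col - 1]


if __name__ == "__main__":
    src = "#define SQ(x) \\\n  ((x) * \\\n   (x))\nint y = SQ(3);"
    spliced = splice_continuations(src)
    star = spliced[0][0].index("*") + 1  # 1-based virtual column of '*'
    print(true_position(spliced, 1, star))  # (2, 8)
```

The per-character table is the simplest thing that works; a real tool would more likely store one (virtual column, line, column) entry per spliced segment and binary-search it.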
I don't mean to discourage you, but just to explain more what problem I meant to refer to by "how to label it" and to highlight limitations of your new test. { But many are probably limited somehow! :-) }
I got tired of debugging the same kind of bugs over and over again. I can't make a dent in the overall quality of the legacy code base, so I decided to start writing a tool to automate the boring parts of debugging memory corruption and memory leaks (no LLM involved so far).
When you try to solve one problem involving two objects in three-dimensional space, you have a six-dimensional problem space. If you have two moving objects, you have a twelve-dimensional problem space. Higher dimensional spaces show up everywhere when dealing with real-life problems.
We usually don't talk about "the dimensions", we talk about the general case: n-dimensional spaces (theorems covering all dimensions simultaneously) or infinite-dimensional spaces (individual spaces covering all finite-dimensional spaces).
Of course, when you try to generalize your theorems you are also interested in the cases where generalization fails. In this case, there is something that happens in a 2-dimensional space, in a 6-, 14- or 30-dimensional space. Mathematicians would say "it happens in 2, 6, 14 or 30 dimensions". I never noticed that this is jargon specific to mathematicians.
Problems in geometry tend to get (at least) exponentially harder to solve computationally as the dimension grows, e.g. the number of vertices of the n-dimensional cube is exactly 2^n. That is why they discovered something about 126-dimensional space only now, when the results for lower dimensions have been known for decades.
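As a small illustration of that growth, here is a sketch that enumerates the vertices of the unit n-cube as 0/1 coordinate tuples. The function names are invented for this example, and brute-force enumeration is only meant to show the blow-up, not how the actual result was obtained.

```python
from itertools import product


def cube_vertices(n: int):
    """Yield the vertices of the unit n-cube as tuples of 0/1 coordinates."""
    return product((0, 1), repeat=n)


def vertex_count(n: int) -> int:
    """Count vertices by enumeration; always equals 2**n."""
    return sum(1 for _ in cube_vertices(n))


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(n, vertex_count(n), 2 ** n)
```

Enumeration is already slow around n = 30; at n = 126 the count is 2**126, roughly 8.5 × 10^37, far beyond anything you can iterate over.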
But that's not how the article says it. It says "in dimensions 2, 6, 14, 30 and 62" instead of "in 2, 6, 14 or 30 dimensions". The latter sounds fine, but "dimensions 8 and 24" to me sounds too much like something is happening in the "8th and 24th dimension". It even uses the singular "dimension 126", as if you took a >=126-dimensional space, ordered the axes, and something interesting happened along the 126th and only that one.
Yeah, that's not what that means. In math "dimension" is used as a statistic. As in, "this manifold has a dimension of 4". So you can say things like "in dimension 4" to mean "when the dimension is equal to 4". We do also say "in 4 dimensions"; it just varies. The two phrases are equivalent. There is no ordering of dimensions or anything like that.
You didn't quote the "In". With the "In" it's the usual math jargon, which means
> "in dimension 4" to mean "when the dimension is equal to 4"
But the title has no "In" and it sounds very weird, perhaps even incorrect. Anyway, note that most of the time the title is not written by the author.
I know how to see the source in my debugger and I know how to see the disassembly. Which IDE will show me the compiler's internal IR if I want something in between?
I felt under-challenged last year and realized that it is because I spend too much time debugging and not enough time writing software. So I did the obvious thing and spent the holiday season writing a new debugging tool instead of playing video games.