Source: I invented Source Maps
The original source map format (v1) was created by Joseph Schorr for use by Closure Inspector
Username checks out. Snark or not, please avoid posting this type of comment here.
There's a 50/50 shot they won't work at all, and in the 50% of cases where they do work, they often don't have the correct line and column.
Again, things like the interactive debugger and "pause on caught exceptions" often casually break with source maps.
In short, it's a mess.
Not to mention several Chrome bugs about sourcemaps over the years.
I've worked on many projects for large companies and I have to agree with the previous commenter that it feels like source maps often don't work correctly.
You can blame the bundling tools, you can blame the deployment process, you can blame the developers... But ultimately, source maps are not so straightforward, and they slow things down because they add a lot of unnecessary complexity to the project.
I don't think I've ever worked on a CoffeeScript or TypeScript project without encountering at least a few major recurring issues with the source maps. When you have many developers working on the same code, things will break in unexpected ways and so you want to minimize system complexity.
My intuition is that while these all solve real problems, they are complex enough solutions that they develop problems of their own over time. Each tool, and each configuration of it, is understood by whoever put it into the project but never by all team members. Two things come out of this: when a tool breaks, it can't be fixed by everyone, and because it can't be fixed by everyone, no one feels responsible; it stays broken or gets patched ad hoc by each team member until a real fix is made by someone who can make one (which doesn't always happen).
By the time you are a few years into a project your tooling is a mess, people that built it have left, and dependencies and the ecosystem have well and truly moved on.
I loved Vagrant, until it broke, on every single project I touched after it was initially implemented, and in a myriad of weird and wonderful ways.
I know these are complex tools, but they often don't feel worth the trouble. They are not robust, they pass out of fashion quickly, and they are rarely effective enough for long enough to truly reap rewards (in my experience).
I want these things to work. But they are broken and in the way so often that I prefer not to rely on them. If a builder had to open his magic drill and learn an unfamiliar skillset to fix it every few months he would probably go back to the regular drill that just worked.
Whether that was trying to get source maps to work when compiling + minifying via Google Closure and Webpack, or using Babel and Webpack back when the underlying source map util (I forget the name) didn't support certain things and having to fudge it myself (I forget the specifics).
They worked, but they were fairly brittle.
Now, tooling seems to have been improved a lot and using things like `angular-cli` means I don't have to do any of the setup anymore and I haven't had these kinds of problems in a really long time.
Ok, the delay from output to showing the source map is still an issue, but it's not one I find a problem. A few seconds after page load is acceptable to me before going digging through the source, and any breakpoints set in the source files continue to work.
Overall, for me, the weak point wasn't the source maps themselves, but the idiot setting up the build tooling to make them possible (me). :)
I've never encountered these problems, have I not done real web development?
Some time ago I took a look at various transpilers for ML-like languages: OCaml->JS, Elm->JS, Haskell->JS, etc.
I worried about how good the source maps would be, because good source maps are non-trivial, and it is no fun to debug in a different language than your source.
To my astonishment, those transpilers provide either rudimentary source maps, or no source maps at all! And I didn't find any significant push towards better source maps. I wondered how this is possible, given that the communities around ML-like languages place great emphasis on high-quality code with as few bugs as possible.
Then I got it: This is just another instance of "If it compiles, it almost always works". Imagine client-side code where no type of server response can be ignored, no unexpected null or undefined can slip into your data structures, every function or method you call will 100% surely exist, and so on. Think of TypeScript or Flow, but on steroids. That's what these languages and their transpilers offer.
Well, at least that's the theory. It would be interesting to hear reports from people actually writing larger web applications that way, and to what extent these promises hold.
When you start binding to JS libraries, all hell breaks loose and you have to track down undefineds and nulls that slip into places they shouldn't, because most JS libraries are poorly written.
I only have experience with js_of_ocaml, which has fairly good support for source maps ... after a painful setup phase. It does help a lot for bindings though.
Wow. That must have changed in the meantime.
Bucklescript seems to be the way forward, and much better than js_of_ocaml. But that's hearsay. Anyone with actual experience in bucklescript?
> you start binding to JS libraries, all hell break loose and you have to track down undefined and nulls that slip into places they shouldn't, because most JS libraries are poorly written
If the interface to that JS library is not performance critical, couldn't you just set up a thin wrapper and communicate with the library via JSON structures? Then, your OCaml code could treat it like any other (potentially buggy) external source.
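A minimal sketch of what such a boundary wrapper could look like on the JS side (the function names here are illustrative assumptions, not any real library's API): round-trip the library's result through JSON so only plain data survives, and reject any null before it reaches the typed, compiled code.

```javascript
// Hypothetical boundary wrapper: JSON round-trip drops functions and turns
// `undefined` values into missing keys, so what remains is plain data the
// typed side can safely consume; nulls are rejected explicitly.
function safeCall(libraryFn, ...args) {
  const raw = libraryFn(...args);
  const data = JSON.parse(JSON.stringify(raw === undefined ? null : raw));
  assertNoNulls(data, "result");
  return data;
}

function assertNoNulls(value, path) {
  if (value === null) throw new TypeError(`unexpected null at ${path}`);
  if (Array.isArray(value)) {
    value.forEach((v, i) => assertNoNulls(v, `${path}[${i}]`));
  } else if (typeof value === "object") {
    for (const [k, v] of Object.entries(value)) {
      assertNoNulls(v, `${path}.${k}`);
    }
  }
}
```

The wrapper trades a copy of the data for the guarantee that nothing on the other side ever sees a null or a live JS object, which is exactly the "treat it like an external source" idea.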
Note that there are a few types of source maps that vary in quality, so sometimes you don't get accurate column info. http://webcache.googleusercontent.com/search?q=cache:qYxqOqD...
The cool thing is, it's really easy to throw together a toy language with first-class debugging and tooling, thanks to sourcemaps. See coffeescript and lightscript for two ecosystems to emulate.
More seriously, the format is unsuitable for random access. It must be fully parsed and held in memory to be used.
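To make the "no random access" point concrete, here is a minimal (illustrative, not production-grade) decoder for the v3 `mappings` field: every field is a base64 VLQ delta against state accumulated from all earlier segments, so a lookup in the middle of the map requires decoding everything before it.

```javascript
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decode one base64 VLQ value starting at `pos`; returns [value, nextPos].
function decodeVlq(str, pos) {
  let result = 0, shift = 0, digit;
  do {
    digit = B64.indexOf(str[pos++]);
    result += (digit & 31) << shift; // low 5 bits carry data
    shift += 5;
  } while (digit & 32);              // bit 6 is the continuation flag
  const value = result >> 1;
  return [result & 1 ? -value : value, pos]; // low bit is the sign
}

function decodeMappings(mappings) {
  // Running state: genCol, srcIdx, origLine, origCol, nameIdx.
  const state = [0, 0, 0, 0, 0];
  const segments = [];
  mappings.split(";").forEach((group, genLine) => {
    state[0] = 0; // only the generated column resets per generated line
    for (const seg of group.split(",").filter(Boolean)) {
      const fields = [];
      let p = 0;
      while (p < seg.length) {
        const [v, next] = decodeVlq(seg, p);
        fields.push(v);
        p = next;
      }
      fields.forEach((d, j) => (state[j] += d)); // every field is a delta
      segments.push([genLine, ...state.slice(0, fields.length)]);
    }
  });
  return segments;
}
```

Because the source index, original line, and original column never reset, you cannot jump to line 5000 of the mappings without replaying lines 1 through 4999 first.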
The opportunity is most easily explained by pointing at the issue of significant whitespace: it should not be a language feature, but an editor option. Those that want it should get it, no need to fuss at the language level. It is not semantics. Source maps could do the trick. One could edit a source mapped version, and have the edits piped back to the source.
I'm sure there are lots more applications that make sense for an editor. Imagine having to edit a big table with lots of columns. If one could supply the editor with a filter that pares the text down to the interesting tidbits, with a source map, edits could be mapped back to the source. That would be a big win.
Another trivial example is coding styles. Everybody has their own; the editor should interface that style to the code base. I believe source map support in editors could enable this.
The biggest win would be an ecosystem of plugins for such services.
We've looked at Sentry and might use that. They offer hosted and open-source on-prem solutions.
The only case where I'd think it makes any sense is protecting programming work from simple replication. While it isn't particularly hard to break bogus client-side security, it's difficult to turn a minified mess into comprehensible code.
Source code is very effectively hidden by obfuscation.
If that were not true, GNU and open-source and GPL would not exist.
Regardless, another important thing is not to download source maps onto clients' machines, as that defeats the whole point of minification.
Decoding stack traces server-side deduplicates the work and doesn't impose an unnecessary performance burden on users.
Browsers don't download source map files unless the developer tools are opened. If your client is using your app with the dev tools open you may have other problems that have nothing to do with performance.
We have an exception service for our company that relies on private sourcemaps, and this is our approach.
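One way to keep source maps private along these lines (a sketch under assumptions, not the commenter's actual service) is to serve the minified bundles publicly but refuse `.map` requests unless they are flagged as internal:

```javascript
// Illustrative request handler: the public gets the bundles, while .map
// files return 404 unless the request comes from the internal exception
// pipeline (how `isInternal` is determined is left as an assumption).
function handleRequest(req, res, isInternal) {
  if (req.url.endsWith(".map") && !isInternal) {
    res.statusCode = 404; // hide that the map even exists
    res.end();
    return "denied";
  }
  res.statusCode = 200;
  res.end("...file contents...");
  return "served";
}
```

The exception service can then fetch the maps it needs for deobfuscation while end users never download them.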
I saw a post a couple years back that suggested that DWARF debugging symbols could be used as an inspiration for improving sourcemaps: http://fitzgeraldnick.com/2015/06/19/source-maps-are-insuffi... .
Once they're done, a new generation will feel it is too complicated and go back to basics with a new language. And reinvent all those wheels again. This is the magic of software: every decade you can pick low-hanging fruit again to improve your resume.
js source maps remind me more of stuff like linespecs that you can give to gdb/lldb for mapping.
SourceMaps can be used to map global names, but not locals. However, Closure Compiler has the ability to generate more elaborate maps that do allow reverse mapping. This is how Google services deobfuscate logs and stack traces sent back by heavily optimized clients.
Is that an extension to source maps that Google came up with? Because from the current specification I do not see how this can be done without hacks. The way we're doing it is inverse token search from a starting position over the minified tokens until we find our minified function name followed by the keyword 'function'.
And that approach is slow and not entirely correct.
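The inverse-search approach described above could be sketched roughly like this (a deliberate simplification with a regex scan instead of a real tokenizer, and not the commenter's actual implementation):

```javascript
// From an error offset in the minified source, find the last
// `function <name>(` that appears before that position and report <name>.
// This is exactly the kind of heuristic that is "slow and not entirely
// correct": it ignores function expressions, nesting, strings, etc.
function enclosingFunctionName(minifiedSource, errorOffset) {
  const head = minifiedSource.slice(0, errorOffset);
  const matches = [...head.matchAll(/function\s+([A-Za-z$_][\w$]*)\s*\(/g)];
  return matches.length ? matches[matches.length - 1][1] : null;
}
```

A proper fix would be function-name information in the map format itself, which is the gap being complained about.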
On the server side, you can store a lot more information since it doesn't need to be transmitted to the client. Closure Compiler stores maps for all variables, all properties, all functions, all renamed strings, all idGenerators, etc. You can store these if you want.
Google's servers store these maps, and when user feedback or exceptions are logged, they are used to deobfuscate them. SourceMaps + functionMaps + propertyMaps + the others I mentioned are used to deobfuscate.
This doesn't solve the problem of deobfuscating locals or heap objects. That needs an extension.
For example, to map binary locations to line numbers, DWARF uses its own opcodes and registers in a custom VIRTUAL MACHINE(!!)
When it could have had a simple table mapping addresses to line numbers.
Then just compress that with an ordinary compression algorithm rather than inventing silly virtual machines.
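A minimal sketch of the kind of flat table being proposed (the addresses and lines are made up for illustration): a sorted array of (start address, line) pairs, where the entry owning an address is the last one whose start is at or below it, found by binary search.

```javascript
// Sorted (startAddress, line) pairs; delta-encode and gzip the whole array
// to get the compression the commenter has in mind.
const table = [
  [0x1000, 10],
  [0x1008, 11],
  [0x1010, 14],
  [0x1024, 15],
];

// Binary search for the last entry with startAddress <= addr.
function lineForAddress(table, addr) {
  let lo = 0, hi = table.length - 1, best = null;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (table[mid][0] <= addr) {
      best = table[mid][1];
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best;
}
```

This is also the structure the later comment about "a simple sorted address->line table with binary search" is referring to.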
The debug info itself is encoded in a very cumbersome encoding of debug-information-entries that point to one another.
A much simpler method would have been RLP, for example.
This was compiled with debug info turned on. Where did the memcpy line go? Turns out, it can be optimized out entirely and the whole function combined. There's no possible mapping back to source code. DWARF allows you to transform this back to stupider, "unoptimized code", which can be stepped through line-by-line.
This is one basic reason why Source Maps are useless: in anything but the stupidest compilers, transformation back to the source code is more complicated than table mapping.
There is absolutely no justification in inventing a virtual machine with its own opcodes for this.
Not sure how compression is going to be better than a VM. The "VM" here is super simple and achieves significantly better compression than an actual compression algorithm. And it's easier to implement and work with. Also, again, this is not just line information, so you really want a state machine here, or it explodes in more and more complexity.
We built a system that generates simple mappings from DWARF's line number programs into files we can mmap, and it's only smaller for the case we care about (line number info). Anything else and DWARF's programs are better. So I'm not surprised DWARF works the way it does.
A simple sorted address->line table with binary search is incredibly faster.
This is a very common use case. At the very least, this proves DWARF is not designed for its common use cases, not properly at least.
Sure, but so are sourcemaps. If that is all the info you need, then you can build tables for that, which, as mentioned, is precisely what we do. However, DWARF is more than that, and DWARF is a really good standard for debug information. You can trivially build cache files for the subset of info you need out of them.
We use DWARF at Sentry just fine, and I am quite a big supporter of the format, as you can guess. In particular, it was designed and specified, unlike sourcemaps, which are a random Google Docs page and don't even solve basic problems such as finding out which function a token belongs to.
I’ve seen a somewhat surprising amount of people say that they absolutely wouldn’t ship source maps these days, which seems fairly security-through-obscurity. Would love to hear some different opinions on this, though.
You're correct that it's security through obscurity. IMO, it's an irrational fear.
Regarding the security aspect - don't put anything you know to be sensitive/insecure on the Internet in the first place.
You're not obligated to make it easy for someone to reverse-engineer/steal/clone your client code.
As it stands, sourcemaps have a wide range of issues that make them pretty much useless for us, and attempts to fix them went nowhere.
Considering this wouldn't make much of a dent in the work needed to get tools like gdb/lldb to work with WebAssembly engines I don't think it's likely to be the best path forward for WASM.
It's not a waste if you can leverage your team's existing knowledge instead of learning the quirks and tricks of make.
It's not a waste if you can use your existing tooling (editor plugins, debuggers, unit test frameworks...) to process your build files.
I don't see how your argument holds ground. If everyone used make, tooling would support it, and you wouldn't need to learn the "quirks and tricks" of a dozen other systems.
The problem with make is that it's not portable, since it needs to call system specific binaries, like rm(1) vs del. There's also the BSD vs GNU divide.
Other than that, a newfangled tool like gulp doesn't even have a clean one-liner for copying one file to a different location with a different name. It's absurd.
Compare to: cp srcfile destloc/destfile
Edit: In addition, the above gulp command reports success even if the file doesn't exist. If you want to check that, you need to import another library and expand the command with another pipe.
However CSS source maps are of a great help for less/sass.
* Some pages - specifically real-time apps - need lots of JS bloat. I get and accept that. Happily I don't need to make any of those beasts.
I can't really see a case where that is a bad thing, particularly when you have source maps to get you back to the original source code, anyway. Your stance just seems very naive.
I'm not sure how we got to the point where something that worked without minification on year 2000 networks no longer works without pre-processing on year 2017 networks.
You can't run C without building it, so you have to do some kind of compilation. I've certainly known people to argue for e.g. including debug symbols even in the build you ship to customers - the savings from stripping them aren't worth the added complexity and difficulty of debugging.
They are not the same, any more than a 6-line Python script is the same as Django.
A few examples are available, but you can also drag in your own files to view those.
What is the advantage of the source map?