The calculator at the bottom of the page is doing some weird calculations. If you have no incidents a year, it still costs you money. So do incidents that take zero minutes to resolve. I took apart the code, and this seems to be the equation in use:
How does this work on the backend? Does it only trace method calls when an exception is thrown, or does it profile the call stack of every request?
Something I've been interested in is the performance impact of using https://docs.ruby-lang.org/en/3.2/Coverage.html to find unused code by profiling production. Particularly using that to figure out any gems that are never called in production. Seems like it could be made fast.
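For what it's worth, the Coverage API makes the basic mechanics pretty simple. Here's a self-contained sketch (file and method names are invented) that loads a file under coverage and reports which executable lines never ran:

```ruby
require 'coverage'
require 'tempfile'

# Write a small demo file to disk so we have something to measure.
code = <<~RUBY
  def used_method
    1 + 1
  end

  def unused_method
    raise "never called"
  end
RUBY

file = Tempfile.new(['coverage_demo', '.rb'])
file.write(code)
file.close

Coverage.start          # begin collecting per-line execution counts
load file.path
used_method             # exercise only one of the two methods

# Coverage.result maps each measured file to an array of counts,
# one entry per line; nil marks non-executable lines (blanks, `end`).
_path, counts = Coverage.result.find { |p, _| p.include?("coverage_demo") }

counts.each_with_index do |count, idx|
  next if count.nil?
  status = count.zero? ? "NEVER RAN" : "ran #{count}x"
  puts "line #{idx + 1}: #{status}"
end
```

Scaling this to whole gems would mean aggregating the per-file arrays over a long production window, but the collection side really is this small.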
Also, it is only necessary because of the lack of type enforcement, which means no code can be relied on and all code has to be constantly inspected for new bugs. Ugh.
Imagine a million-line codebase. There are half a dozen suspicious methods with complex sets of if/else if/else statements. And each of those branches makes subsequent method calls.
Determining that code path is a nightmare. Types won't save you.
Types absolutely do prove logical correctness. In fact that’s all they do. However they can only prove the correctness of logic that’s type encoded. If your program is a primitive soup, there’s not much logic for them to prove.
I can see this being useful in strongly-typed languages too. There is a massive class of logical bugs that will type-check correctly, that will still result in wrong results being returned.
I built a toy raytracer in Haskell for fun. I found all of these bugs. It turns out that when you implement dot product slightly wrong, the image output is very confusing.
Even just redundant call paths are useful to spot. I recently looked through the standard library of a strongly, statically typed language and found that in one pretty basic function a validation function was called over 8 times, despite reliably returning exactly the same result every time; this tool would highlight that very easily.
That's not even mentioning the logic bugs you can spot more easily as well.
Recently did something similar for a Java project using AOP. Basically adding an annotation to each method, logging the parameters before the method call and the return values after. Whenever there is an exception, a mail is sent with the stack trace along with the entire request path (including method calls, parameters, and return values). Extremely useful for debugging and for proactively fixing issues.
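Not the parent's actual AOP setup, but the same pattern can be sketched in plain Ruby with Module#prepend (all names here are invented): log the parameters and the return value, or the exception, around each wrapped call.

```ruby
module TraceLog
  def self.entries
    @entries ||= []
  end

  # Wrap one method on a class so every call is logged with its
  # arguments and return value; exceptions are logged and re-raised.
  def self.wrap(klass, method_name)
    interceptor = Module.new do
      define_method(method_name) do |*args, &block|
        begin
          result = super(*args, &block)
          TraceLog.entries << "#{klass}##{method_name} args=#{args.inspect} => #{result.inspect}"
          result
        rescue => e
          TraceLog.entries << "#{klass}##{method_name} args=#{args.inspect} raised #{e.class}"
          raise
        end
      end
    end
    klass.prepend(interceptor)
  end
end

class Checkout
  def total(prices)
    prices.sum
  end
end

TraceLog.wrap(Checkout, :total)
Checkout.new.total([5, 10])
puts TraceLog.entries.last   # Checkout#total args=[[5, 10]] => 15
```

On an exception you'd dump `TraceLog.entries` into the alert mail, which gives the "entire request path" effect described above.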
Curious how you do this performantly for any non-trivial codebase. Like, consider a class whose logging representation is the data it contains, which can be arbitrarily large. Generally this is not an issue because it’s only logged rarely and on designated paths where people actually care about this and it was expected to be used in that fashion. How would this work for a large object that you pass to a function repeatedly, or in a deeply nested stack trace?
The project I worked on has a non-trivial codebase. So far I haven't seen any performance issues though I was worried initially. The idea is to use it during development and beta testing and switch it off later once the application is stable enough. Might keep it on for some more time if there are no performance issues.
1) In a large-scale production scenario, you typically do not have the data, nor the interaction flow, to reproduce the bug locally. The idea is that you enable Call Stacking on the fly, when needed. Turn it off when not needed.
2) Having multiple runtime captures of the same endpoint across two different deployments or time periods allows you to quickly compare for logic or data changes (argument values and return values are visible).
3) Commenting on individual lines of execution allows for the team to have a specific discussion surrounding logic changes.
Would this mean that any data I happened to have in memory during the flow now permanently lives in callstacking's data stores? How does it handle all the data flowing through from a security perspective?
It respects the normal RoR toolchain parameter filtering, so anything that you say is sensitive (or everything by default, if you'd like) also doesn't get sent to CallStacking.
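For anyone unfamiliar, that's Rails' standard parameter filtering; a typical initializer looks like this (the exact keys below are just examples), and matching values appear as [FILTERED] in logs:

```ruby
# config/initializers/filter_parameter_logging.rb (standard Rails location)
Rails.application.config.filter_parameters += [
  :password, :password_confirmation, :ssn, :credit_card
]
```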
The goal is to quickly be able to see just the important, executed methods for a given request.
E.g. you may have a 2,000-line User model, but Call Stacking allows you to pinpoint, "Oh, only these three methods are actually being called during authentication. And here are the subsequent calls that those methods make. And here's where the logic change occurred."
When the request is completed, the instrumented methods are removed (thus removing the overhead).
You have to enable it judiciously. But for a problematic request, it gives the entire team a holistic view of what is really happening: which methods are called, their parameters, and their return values are all made visible.
You no longer have to reconstruct production scenarios piecemeal via the rails console.
I'm not familiar with rails, so sorry if your reply above inherently answered this.. So you're enabling the tracing with a call in your controller method, but how is the tool capturing function params and returned values for sub-calls in the respective controller method?
Is it waiting for execution to return to the controller method and polling the stack trace from there?
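For context on how this is possible without polling: Ruby's built-in TracePoint API fires a VM event at every method call and return, with access to argument and return values. A minimal sketch (method names invented):

```ruby
def helper(n)
  n * 2
end

def controller_action
  helper(21)
end

events = []
trace = TracePoint.new(:call, :return) do |tp|
  if tp.event == :call
    # Read the callee's argument values out of its binding.
    args = tp.parameters.map { |_type, name| [name, tp.binding.local_variable_get(name)] }
    events << [:call, tp.method_id, args]
  else
    events << [:return, tp.method_id, tp.return_value]
  end
end

# Only code run inside the enable block is traced, so a tool can
# switch this on per-request and off again afterwards.
trace.enable { controller_action }
events.each { |event| p event }
```

I don't know whether Call Stacking uses TracePoint or rewrites the methods directly, but either way no stack polling is needed; the VM hands you every call and return as it happens.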
As far as I can tell, it only executes the trace when asked. It's not an APM like New Relic. Most likely the trace meaningfully slows down the individual request.
When I was at ScoutAPM, we built a version of this that was stochastic instead of 100% predictable. We sampled the call stack every 10-50ms. Much lower overhead, and it caught the slower methods, which is quite helpful on its own, especially since slow behavior often isn't uniform; it happens for only a small handful of your biggest customers. But it certainly missed many fast-executing methods.
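The sampling idea can be sketched in a few lines of plain Ruby (an illustration, not ScoutAPM's actual implementation): a background thread snapshots the main thread's backtrace on an interval, so slow frames accumulate samples while fast ones are mostly missed.

```ruby
samples = Hash.new(0)
main = Thread.current

# Background sampler: wake up every ~20ms and tally the main thread's frames.
sampler = Thread.new do
  loop do
    main.backtrace&.each { |frame| samples[frame] += 1 }
    sleep 0.02
  end
end

def slow_work
  sleep 0.3   # stands in for a slow query or HTTP call
end

def fast_work
  1 + 1       # almost never lands in a 20ms sample
end

100.times { fast_work }
slow_work
sampler.kill

slow_hits = samples.sum { |frame, count| frame.include?("slow_work") ? count : 0 }
puts "samples containing slow_work: #{slow_hits}"
```

The trade-off is exactly as described: near-zero overhead and good at finding where time goes, but blind to methods that return quickly.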
Different approaches for sure, solve different issues.
The irony is that the issues this helps with could be solved far before production: at compile time, or even with some local runtime check. Just not in Ruby.
Nearly all the issues this shows you quickly are issues that static typing would prevent at compile time, or that type hints would surface at dev time.
I've been doing full-time Rails for 12+ years now, PHP before that, C before that. But I always developed side gigs in Java, C#, and other typed languages, and now I've finally moved full-time to Rust. They solve this.
Before production. You want this solved before production. Really.
Of the bugs that I've experienced in large-scale, Rails production systems, typing is a small subset.
Manually reconstructing logic errors based on a combination of user input and system data is the most time-consuming kind of issue to diagnose.
When your codebase is 500,000+ lines of code, which code paths are relevant for a given endpoint? What methods were called and under what context? How do we begin to reconstruct this bug?
These are the scenarios in which Call Stacking gives instant visibility.
> Of the bugs that I've experienced in large-scale, Rails production systems, typing is a small subset.
Really? Are you counting errors where a value turned out to be nil when it wasn’t expected to be? Because that’s a type error. It’s not a type error that many statically typed languages fix (Java is notorious for null pointer exceptions) but it’s a type error that can, in principle, be fixed with static typing.
I'm always surprised by that. I've worked with a 10-15 person team on a relatively large Rails codebase, and yes, the bugs we usually see are not type-related bugs (including nil when it's not expected). I keep reading people saying that type systems eliminate a huge class of bugs, but that's not been my experience with languages that use poor type systems (Java, Rust...). Languages that have much better types, like OCaml, are different, and I've had great luck with OCaml in particular (my favorite language to use when I can). But there's also the fact that devs who use OCaml tend to be much better than average (in the same way that, in my experience, devs working exclusively with PHP or Node.js tend to be much worse than average).
Note, before I'm downvoted for my last comment: there are exceptions, but I'm comparing the average candidate applying for a PHP or Node.js position to one applying for a Rails or OCaml position.
In the app I work on a huge percentage (60% in the timeframe I analyzed) of the errors are errors a type system would catch. Nil references, long method chains where the middle call returns a different object than expected, bad refactorings changing the return type of methods…
It’s staggering, really. These could be solved with better programming practices; they’re not errors experienced programmers usually make, but they exist nonetheless.
An exception/error system, with checked exceptions or with errors handled via a result monad or explicit error values as in Go, would solve the vast majority of the other errors.
Type systems also encourage other things, like contracts on API endpoints and typed messages, serving as an early warning system when you're writing those bugs.
Purely business logic bugs are actually not common. I guess we haven’t had the opportunity to create those “higher order” bugs while wading through the others.
In this case, I guess you can create some ruby gem/library/convention/dsl/architecture that makes it harder to accidentally pass nils - and be more efficient than a popular language that has static typing.
I’m writing that to address the “just not in ruby” remark[1] from earlier.
Depends on the type system. I would say for Java / Python level static types they catch a small but significant fraction of bugs (10-20% according to the only objective measurement I've seen, which is easily worth it). However some languages like Rust, Haskell and OCaml let you express much more in the type system.
Subjectively it feels like that catches more like 30-60% of bugs.
So this thing is still useful but Berkes is right that you need it a lot less if you use better static types.
> which code paths are relevant for a given endpoint?
This is exactly the question that static types can answer... statically. You don't need a runtime log to find out.
You do need a runtime log to see the actual values though. So it's not like a debugger is completely useless in Rust. But I definitely reach for it much less than in other languages.
I worked on a large financial transaction system for up to $7B per day in Haskell that used formal methods like Agda, TLA+, etc, with a lot of logic in the type level (i.e. Liquid Haskell), as a test engineer. The entire system was covered in property based tests, proofs, and then normal tests from unit -> system. We literally had two bugs categorised as P2/P1/P0 on release, both of which were design related, and both were fixed before users saw them. It was crazy effective (but took years).
> This is exactly the question that static types can answer... statically. You don't need a runtime log to find out.
This tool seems to be able to display the relevant code paths. That sounds super convenient and useful. Do statically typed languages have tools that do something similar?
Yeah "find all references" will show you everything that can call a particular function. As I said it won't give you the actual values so this still seems useful.
I don’t know if a similar tool exists but in principle Java and .NET has profiling APIs that let you look into method calls. They can be used as a basis for a similar tool.
Any `nil` error, like the famous `undefined method 'users' for nil` is a type error.
Every time that serialization gets the wrong value passed in but continues anyway, is a type error. Every time a database-record misses a value (NULL) but the app continues to run over that, is a type error. And so on.
When I look at my most stable apps' Rollbar or Sentry, the top 20 errors are nearly all errors that a type system which disallows implicit null (unlike Java's, which allows it everywhere - ugh, useless) would've caught at compile or dev time. The very few non-typing errors that are then left are race conditions and business-logic bugs.
The latter are really the only bugs that I'm "fine" with, they come with the domain. Race-conditions are quite often deep down also typing issues - they surface as similar `Undefined method on nil` errors because some data is nil due to the racing threads. Something a typing system would partly fix - as we can see in Rust.
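To make that concrete with RBS, Ruby's standard signature format (the classes below are invented): once a return type is declared nilable, a checker such as Steep rejects an unguarded `post.author.name` until the nil case is handled.

```rbs
# sig/post.rbs -- hypothetical signatures
class User
  def name: () -> String
end

class Post
  # `User?` means "User or nil", e.g. when the account was deleted
  def author: () -> User?
end
```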
I have not come across this sentiment before. Is there something specific about Ruby that makes you think this way or is this your general view of dynamic languages without a strong type system?
Well, if you never get around to releasing the feature, you'll never have production issues. Ruby embodies "perfect is the enemy of done" in a way that I tend to appreciate.
There are a couple of things wrong with it. The third * should be switched with a +, and the last term needs to be multiplied by the number of incidents.
Which, if anyone at Call Stacking is here, just means changing the third `*` to a `+` and multiplying the last term by the number of incidents (I'm assuming the code is minified). Edit: With the correct math, the example is wildly different. It should be $37,277.81, not $87,991.23.