Show HN: FlowTracker – Track data flowing through Java programs (github.com/coekie)
280 points by coekie 27 days ago | 32 comments
FlowTracker is a Java agent that tracks data flowing through Java programs. It helps you understand where any program got its output from, what it means, and why it wrote it.

Watch the video or explore the live demo yourself, and read how it works at https://github.com/coekie/flowtracker




Cool! I wrote something in the same spirit but for Clojure, called FlowStorm: http://www.flow-storm.org/

For instrumentation, instead of an instrumenting agent, it uses a fork of the official Clojure compiler (in Clojure you can easily swap compilers at dev time) that adds extra bytecode. What is interesting about recording the execution of Clojure programs is that most values are immutable, so you can snapshot them by just retaining the pointers.

Edit: Since the OP's demo is about exploring a web app, for people interested in these topics I'm also leaving a demo of FlowStorm debugging a web app: https://www.youtube.com/watch?v=h8AFpZkAwPo
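
To make the point about immutability concrete, here is a minimal Java sketch (not FlowStorm code; the class and method names are made up for illustration). It assumes a recorder that only keeps references: for an immutable value the retained reference keeps showing the state at recording time, while for a mutable value it silently picks up later changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from FlowStorm.
final class SnapshotByReference {

    // Returns what a reference-only recorder would show after the program kept running.
    static List<Object> demo() {
        List<Object> recorded = new ArrayList<>();

        // Immutable value: an "update" builds a new list, so the retained
        // reference still shows the state at recording time.
        List<Integer> immutable = List.of(1, 2);
        recorded.add(immutable);                  // retain the pointer, no copy
        List<Integer> grown = new ArrayList<>(immutable);
        grown.add(3);
        immutable = List.copyOf(grown);           // rebinds the variable only

        // Mutable value: the retained reference sees later changes, so a
        // recorder would need a defensive copy to get a true snapshot.
        List<Integer> mutable = new ArrayList<>(List.of(1, 2));
        recorded.add(mutable);
        mutable.add(3);

        return recorded;
    }

    public static void main(String[] args) {
        List<Object> recorded = demo();
        System.out.println("immutable as recorded: " + recorded.get(0)); // [1, 2]
        System.out.println("mutable as recorded:   " + recorded.get(1)); // [1, 2, 3]
    }
}
```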


This is beautiful! Great job! What was the reason you chose JavaFX? After you chose JavaFX, did you look at cljfx?


Thanks! I started with cljfx and then moved to pure JavaFX, first because it wasn't straightforward for me to understand the performance overhead of cljfx under different scenarios (when using subscriptions and contexts), and second because I want to drag in as few dependencies as possible so they don't conflict with the debuggee's.


Nice!

Do you use data structure metadata for tracking values?


Not sure I follow, can you expand on that? I gave a presentation on it recently, https://www.youtube.com/watch?v=BuSpMvVU7j4, which goes over demos and implementation details if you are interested in those topics.


I meant this [0]: tracking a piece of data by tagging it with metadata and seeing where it ends up.

Thanks for the video, I'm gonna go watch it.

[0]: https://clojure.org/reference/metadata


So recording in FlowStorm doesn't use Clojure's metadata capabilities in any way. It is basically about storing function calls, function returns, and the pointers to the intermediate immutable values of all expressions, together with their code coordinates.
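
As a rough illustration of that recording model, here is a hand-instrumented Java sketch. It is not FlowStorm's implementation: the event kinds, the string-based "coordinates", and the `emit` helper are all assumptions chosen to show the idea of a timeline holding calls, returns, and expression values next to the place in the code that produced them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class, field, and coordinate names are illustrative, not FlowStorm's.
final class TraceSketch {
    enum Kind { FN_CALL, EXPR, FN_RETURN }

    static final class Event {
        final Kind kind;
        final String fn;
        final String coord;  // where in the function's code the value was produced
        final Object value;  // reference to the (assumed immutable) value, not a copy

        Event(Kind kind, String fn, String coord, Object value) {
            this.kind = kind;
            this.fn = fn;
            this.coord = coord;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + " " + fn + " @" + coord + " -> " + value;
        }
    }

    static final List<Event> TIMELINE = new ArrayList<>();

    // Appends an event to the timeline and passes the value through unchanged.
    static <T> T emit(Kind kind, String fn, String coord, T value) {
        TIMELINE.add(new Event(kind, fn, coord, value));
        return value;
    }

    // Hand-instrumented version of: int area(int w, int h) { return w * h; }
    static int area(int w, int h) {
        emit(Kind.FN_CALL, "area", "args", List.of(w, h));
        int product = emit(Kind.EXPR, "area", "body/0", w * h);
        return emit(Kind.FN_RETURN, "area", "ret", product);
    }

    public static void main(String[] args) {
        area(3, 4);
        TIMELINE.forEach(System.out::println);
    }
}
```

A real tool would have the compiler or an agent insert the `emit` calls; writing them by hand here only keeps the sketch self-contained.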


This is incredibly cool.

I love how good the tooling is in the Java/JVM ecosystem. The last time I was this blown away was with JITWatch ( https://github.com/AdoptOpenJDK/jitwatch )

FlowTracker reminds me a little of taint analysis, which is used for tracking unvalidated user inputs or secrets through a program, making sure they are not leaked or used without validation.

Search keywords are "dynamic taint tracking/analysis".

https://github.com/gmu-swe/phosphor

https://github.com/soot-oss/SootUp

https://github.com/feliam/klee-taint
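
For readers new to the term, a minimal Java sketch of dynamic taint tracking follows. It is an assumption-laden toy: the `Tainted` wrapper, the label names, and the `log` sink are invented for illustration, and the propagation is written by hand, whereas real tools typically attach labels through instrumentation instead of requiring a wrapper type.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy; not the API of any of the linked projects.
final class TaintSketch {

    // A string paired with the labels of every source that contributed to it.
    static final class Tainted {
        final String value;
        final Set<String> labels;

        Tainted(String value, Set<String> labels) {
            this.value = value;
            this.labels = Collections.unmodifiableSet(new TreeSet<>(labels));
        }

        static Tainted clean(String value) {
            return new Tainted(value, Set.of());
        }

        static Tainted source(String value, String label) {
            return new Tainted(value, Set.of(label));
        }

        // Propagation rule: the result carries the union of both operands' labels.
        Tainted concat(Tainted other) {
            Set<String> merged = new TreeSet<>(labels);
            merged.addAll(other.labels);
            return new Tainted(value + other.value, merged);
        }
    }

    // Sink: refuses any data that carries the "secret" label.
    static void log(Tainted message) {
        if (message.labels.contains("secret")) {
            throw new IllegalStateException("secret would leak into the log");
        }
        System.out.println(message.value);
    }

    public static void main(String[] args) {
        log(Tainted.clean("hello ").concat(Tainted.source("alice", "user-input")));

        Tainted leak = Tainted.clean("token=").concat(Tainted.source("s3cr3t", "secret"));
        try {
            log(leak);
        } catch (IllegalStateException e) {
            System.out.println("blocked: " + e.getMessage());
        }
    }
}
```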


Blown away by the demo tracking an HTML element back to the SQL statement that added that value to the database.

I can totally see a future where tools like this are the first line of defense when troubleshooting bugs.


Thanks.

As I was developing FlowTracker, a lot of the work was driven by making tracking of specific example programs work. I knew what result I was aiming for, but it was hard to predict what lower level mechanisms needed to be supported to make a specific example work. That often depended on internal implementation details of the JDK or libraries being used where the data was passing through.

But the HTML element linking back to the SQL script that added that data into the database wasn't like that. I didn't expect or work towards it, that just happened, so it blew me away a little too and got me excited about what else this approach could accomplish.


A great example of how design of good products should be guided by the end goal instead of by the technical mechanism, when possible. You went out of your way to make sure the functionality was not limited by a certain single mechanism.


I didn't make it to that element of the demo because I don't need a tool to help me find which file HTML text strings are from or that HTTP headers come from my web server. So I would recommend putting that "wow" element earlier in the demo.


Or maybe split the demo into shorter demos/gifs where each highlights a specific part. Very cool project, should get more attention.


When you think about it, so many problems could have been prevented and so many business rules could have been easier to express if there was some standard way to track the origins and veracity of data.

Maybe also some way to track if the data is meant to be transient or meant to be written back.

The more such constraints that can be described up front, the better.


I am not really sure I get the full picture and how it might be used, but it somehow reminds me of a Smalltalk environment, where I can also inspect everything (everything is objects and messages, and you can trace back and interact with those).


Very cool! I love the demo video and I could definitely see how this would be useful when diving into an unfamiliar codebase.


Years ago I experimented[1] with a similar concept (wanting something like JavaScript source maps, but for HTML). I didn't manage to find the time to expand on it, but I think web developer tooling would really benefit from this sort of full-stack attribution.

Integration of any solution like this into existing frameworks feels like a big challenge.

[1] HTML Source Maps - https://github.com/connorjclark/html-source-maps https://docs.google.com/document/d/19XYWiPL9h9vA6QcOrGV9Nfkr...


This reminds me (in the best way possible) of the Eve-lang demos of debugging a program by simply asking "why is <the UI element> not here?" Fantastic work!

https://www.youtube.com/watch?v=TWAMr72VaaU&t=164s and https://witheve.com/


If I recall correctly, there was a paper on a similar tool that was used for finding SQL injections dynamically in Java programs. Is this the same tool?


No, that must have been something different.

It would be possible to extend what FlowTracker does to also find SQL (or other) injection vulnerabilities. So it's possible the tool you're thinking of used a similar approach.
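
A hedged sketch of what such an extension could look like, in plain Java: if the origin of every character of a query is known, a checker can flag queries where user-supplied characters contain SQL syntax. This is not FlowTracker's API; `TrackedQuery`, its methods, and the single-quote heuristic are all assumptions made for illustration, and a real detector would need a much more careful rule.

```java
// Hypothetical sketch; none of these names exist in FlowTracker.
final class InjectionCheckSketch {

    // Builds a query while remembering, per character, where it came from.
    static final class TrackedQuery {
        private final StringBuilder text = new StringBuilder();
        private final StringBuilder origins = new StringBuilder(); // 'L' = literal, 'U' = user input

        TrackedQuery literal(String s) {
            return append(s, 'L');
        }

        TrackedQuery userInput(String s) {
            return append(s, 'U');
        }

        private TrackedQuery append(String s, char origin) {
            text.append(s);
            for (int i = 0; i < s.length(); i++) {
                origins.append(origin);
            }
            return this;
        }

        String sql() {
            return text.toString();
        }

        // Crude heuristic: a quote that originated from user input can end the
        // string literal the programmer intended and change the query structure.
        boolean looksInjected() {
            for (int i = 0; i < text.length(); i++) {
                if (origins.charAt(i) == 'U' && text.charAt(i) == '\'') {
                    return true;
                }
            }
            return false;
        }
    }

    static TrackedQuery findUser(String name) {
        return new TrackedQuery()
                .literal("SELECT * FROM users WHERE name = '")
                .userInput(name)
                .literal("'");
    }

    public static void main(String[] args) {
        System.out.println(findUser("alice").looksInjected());
        System.out.println(findUser("x' OR '1'='1").looksInjected());
    }
}
```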


I once had a vision of tracking data across the internet: where an image came from, which CDN served it, or "what has this string seen from its creation until it reached my screen". This is a step in that direction.


Thanks for this!

Been trying to get this to work with VSCode on a project I'm trying to make sense of. Having to take a pause on it right now, but looking forward to getting it working and playing with it.


This is awesome! Reminds me of Java Flight Recorder!


This looks really cool, I think this could have saved me some time hunting bugs when I was working with Spring in the past.


This is pretty cool! Do you think something similar is possible for C#, too?


There's an event system that is integrated into all kinds of bits in the standard library and surrounding ecosystem, and it plugs into all kinds of high-level tools: dotnet-trace/-monitor/-counters, profiling in VS and Rider, etc. There are also telemetry hooks, but I have not looked into them in closer detail; supposedly that's what .NET Aspire uses.

I think a similar experience can be quickly achieved with tracing in Aspire: https://devblogs.microsoft.com/dotnet/introducing-dotnet-asp...

It's a bit different, and I don't know whether anyone has made a quick, handy GUI tool that hooks up to .NET's EventPipe and displays its data in a nice way, but the extensive API for that is there.


Thank you. To be honest, I hadn't heard of Aspire before, so TIL.

From a quick skim read, it sounds similar in some aspects. And also it's a good starting point.


That is really cool, really like that there's a browser demo too


Hmm, would love to connect this to our Gradle builds :)


I know you may be tongue-in-cheek, but if you're on the .gradle.kts flavor there's a reasonable chance it would work. The Groovy ones, I suspect, rely too heavily on dynamic dispatch for it to make any sense (e.g. all flows are from org.groovy.SomeRandoThing and good luck).


Very very impressive.


Very impressive!



