Show HN: Compare Analysis Tools For Python, Ruby, C, PHP, Go (analysis-tools.dev)
175 points by mre on Aug 20, 2020 | 57 comments

Hey, this post got more attention than I thought. Happy to answer your questions and get some feedback on what to improve.

Maybe people are interested in some tech:

My colleague Jakub and I built this site with GatsbyJS and Cloudflare Edge Workers. The 99th percentile of response times from the workers is currently 9.7ms, which is impressive.

The code is fully open source on GitHub [1].

It is based on submissions by 190 individual contributors so far [2].

We went for an open model and depend completely on GitHub Sponsors for funding. We are not trying to grow rapidly here, but rather to build a steady business.

You can read more about the business model in our first blog post [3]. If your company might be interested in sponsoring, let us know or check the offerings here: https://github.com/sponsors/analysis-tools-dev/ <3

[1]: https://github.com/analysis-tools-dev/website/
[2]: https://github.com/analysis-tools-dev/static-analysis
[3]: https://analysis-tools.dev/blog/static-analysis-is-broken-le...

Nice list! But I'm curious how you plan to improve the comparability? Static analysis covers a wide range, and I'd dare to say a beautifier is a little bit different than a rule checker, which is again a little bit different than an absence-of-error prover.

Disclaimer: I work in the field and am probably a bit biased ;)

Good point. A couple of thoughts: It helps to put different tools (formatters, linters) into separate groups per language. On top of that, adding the mode of execution (rule-based, AST-based and so on) would help with classification, too, but it's a lot of work that we could use help with over on GitHub. One thing that has become quite obvious to us is that people are interested in filtering by web services, command-line tools and so on, which we will add. That data is already there; we just haven't gotten around to adding filters to the site. The same goes for filtering by proprietary vs. open source licenses. Many other categories come to mind. The challenge is to make it a helpful tool for beginners in the field while also adding features to drill down into the details. If you or anyone else would like to discuss that further or wants to help build it, head over to https://github.com/analysis-tools-dev/website/issues/11. That's one big advantage of building a product in the open. :)

I'm not familiar with Cloudflare Edge Workers, so sorry if this is an obvious question, but what kind of data store (database) can you use in a setup like this?

A while ago, this [0] hosting solution using Backblaze B2 and Cloudflare Workers looked interesting. I believe it uses an AWS S3-compatible API [1].

[0]: https://jross.me/free-personal-image-hosting-with-backblaze-...
[1]: https://blog.cloudflare.com/backblaze-b2-and-the-s3-compatib...

Cloudflare provides a simple key-value store for values up to 2MB (https://www.cloudflare.com/products/workers-kv/). We don't use that, though, and instead use Firebase on GCP plus a daily data backup at the moment.

The glaring error right at the very first entry seems bad. Black is not an analysis tool, it's a code formatter.

The `formatter` tag is already part of the underlying data [1], it's just not reflected on the website yet. We plan to put formatters into a visually separate group per language and add a filter functionality for the type (formatter, linter, metalinter, web-service).

[1]: https://github.com/analysis-tools-dev/static-analysis/blob/0...


A linter that shells out to other linters and combines the result at the end. Example: https://analysis-tools.dev/tool/go-meta-linter
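To make that concrete, here is a rough sketch of the idea in Python. It is not how go-meta-linter or any real metalinter is implemented; the function name `run_linters` and the stand-in commands are made up for illustration, and it assumes each linter prints one finding per line on stdout.

```python
import subprocess
import sys
from typing import Dict, List


def run_linters(commands: Dict[str, List[str]]) -> Dict[str, dict]:
    """Run each linter command and merge the results into one report.

    Assumption: every linter writes one finding per line to stdout.
    """
    report = {}
    for name, cmd in commands.items():
        # Shell out to the underlying linter and capture its output.
        proc = subprocess.run(cmd, capture_output=True, text=True)
        findings = [line for line in proc.stdout.splitlines() if line.strip()]
        report[name] = {"exit_code": proc.returncode, "findings": findings}
    return report


# Stand-in "linters": tiny Python one-liners instead of real tools.
demo_commands = {
    "style": [sys.executable, "-c", "print('a.py:1: line too long')"],
    "clean": [sys.executable, "-c", "pass"],
}

if __name__ == "__main__":
    for name, result in run_linters(demo_commands).items():
        print(name, result["exit_code"], result["findings"])
```

Real metalinters usually also normalize the different output formats and de-duplicate overlapping findings, which this sketch skips.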

Hadn't heard that term before, thanks!

It looks like those tools are sorted by votes, but some of them can analyze different languages, and votes are shared between their languages.

For example, CodeScene, which supports 12 languages, is the currently most voted tool for PHP, and I've never heard of it. Not saying it's bad or anything, but I highly doubt it's popular in the PHP community, compared to other products.

Oh yeah, we realized that as well. What do you think about putting services for multiple languages into a separate group and excluding them from the voting for the individual languages?

I think it would make more sense to vote on tags. Then you could compare multi-language formatters with single language by how many votes they had for that tag.

Iirc imgur had (has?) a system like this in addition to the main up/downvote (but I don't think you need a main up/downvote, since you're creating a list, not a feed).
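A quick sketch of what per-tag voting could look like, assuming each vote is stored as a (tool, tag) pair. That storage shape and the function name `rank_by_tag` are my assumptions, not how the site stores votes today.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_tag(votes: Iterable[Tuple[str, str]], tag: str) -> List[Tuple[str, int]]:
    """Rank tools by the number of votes they received for one tag."""
    counts = Counter(tool for tool, voted_tag in votes if voted_tag == tag)
    # most_common() sorts by vote count, highest first.
    return counts.most_common()


# Hypothetical votes: a multi-language tool only scores where it was voted for.
votes = [
    ("CodeScene", "java"),
    ("PHPStan", "php"),
    ("PHPStan", "php"),
    ("CodeScene", "php"),
]

if __name__ == "__main__":
    print(rank_by_tag(votes, "php"))
```

With this shape, a tool that supports 12 languages no longer carries its total vote count into every language list.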

That... makes a lot of sense to me. Probably even going away from a simple upvote/downvote towards a 1-5 star rating system would help as well. In the end we're interested in the quality of the tool (and the number of ratings).

Evan Miller argues that it's probably simplest to stick with thumbsup/thumbsdown, but not to fall into the trap of using dead-simple analysis methods like "Score = (Positive ratings) − (Negative ratings)".

Instead he argues you should use the "Lower bound of Wilson score confidence interval for a Bernoulli parameter". He provides that equation and example code to calculate it (Ruby, SQL, and/or Excel).
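For reference, a minimal Python sketch of that lower bound. This is my own transcription of the standard Wilson score formula, not Evan Miller's code; the function name and the default z = 1.96 (roughly 95% confidence) are assumptions.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a Bernoulli parameter.

    positive: number of thumbs-up ratings
    total:    number of all ratings (up + down)
    z:        normal quantile; 1.96 is assumed here for ~95% confidence
    """
    if total == 0:
        return 0.0  # no ratings yet, so no evidence of quality
    phat = positive / total
    denominator = 1 + z * z / total
    centre = phat + z * z / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * total)) / total)
    return (centre - margin) / denominator


if __name__ == "__main__":
    # A single upvote ranks below 90 upvotes out of 100 ratings.
    print(wilson_lower_bound(1, 1), wilson_lower_bound(90, 100))
```

Sorting by this value instead of "positive minus negative" keeps items with very few ratings from jumping to the top.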

(Found these links shared by another user a couple of weeks ago[0])

[0]: https://news.ycombinator.com/item?id=24089960

This is something I'll really want to look into. Right now we are using the mentioned "dead-simple" method ;-)

You'll end up with a skewed rating anyway (either 1 or 5 stars for most ratings).

When I was looking into writing recommendation systems, one paper made an interesting observation: in a 1-10 rating system, the only scores people naturally agree on are 1, 5, and 10. Anything else can only be evaluated relative to the same user's other scores. (Comparing my 9 and 10 scores has meaning, but comparing your 8 and my 9 does not.)

The polarization is a natural result of the lack of definition.

So it's not worth the trouble then?

Please do not exclude them! I bet some of them are very good, especially since you support "similar" languages (JavaScript / TypeScript, for example).

smichel17's solution makes the most sense to me.

For real. Phpstan and phpcs for life.

Shameless self-promotion time: TypeScript Call Graph

A CLI to generate an interactive graph of functions and calls from your TypeScript files.


Looks nice! I built a couple of tools similar to that for Java that helped me in refactoring large codebases:

https://github.com/krlvi/jar-graph for visualising interclass dependencies

https://github.com/krlvi/java-rdeps for recursively discovering all transitive call sites of a given Java method

https://github.com/krlvi/cyclist for detecting cyclical dependencies between packages in the same Jar

Sharing just in case it is useful to somebody else

Looks great! The threshold for addition to analysis-tools.dev is (an arbitrary) 20 stars. If you like, you can already create a pull request here: https://github.com/analysis-tools-dev/static-analysis/blob/m... Our list is open source, we just render a website version (also open source) from it for some more functionality.

Thank you -- I saw the 20 star cutoff for submission. Perhaps having posted my link on HN I'll get the 8 more stars I need ;)

Well, you have one more now. ;)

It would be useful to list which tools support the SARIF standardized format (https://sarifweb.azurewebsites.net/).
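To give an idea of why a shared format helps, here is a small sketch that reads a SARIF 2.1.0 log and prints one line per finding. The sample log, the tool name `example-linter` and the function name `summarize` are invented for illustration; only the field names come from the SARIF format.

```python
import json
from typing import List

# Hand-written sample log; the tool name and the finding are made up.
SAMPLE = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-linter"}},
      "results": [
        {
          "ruleId": "E501",
          "level": "warning",
          "message": {"text": "line too long"},
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {"uri": "src/app.py"},
                "region": {"startLine": 12}
              }
            }
          ]
        }
      ]
    }
  ]
}
"""


def summarize(sarif_text: str) -> List[str]:
    """Flatten a SARIF log into 'tool: file:line: rule message' strings."""
    log = json.loads(sarif_text)
    rows = []
    for run in log.get("runs", []):
        tool = run["tool"]["driver"]["name"]
        for result in run.get("results", []):
            # Locations are optional, so fall back to placeholders.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "?")
            line = physical.get("region", {}).get("startLine", 0)
            rule = result.get("ruleId", "?")
            rows.append(f"{tool}: {uri}:{line}: {rule} {result['message']['text']}")
    return rows


if __name__ == "__main__":
    print("\n".join(summarize(SAMPLE)))
```

Any tool that emits this structure can be consumed by the same reader, which is the point of listing SARIF support.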

Interesting. Never heard of it, but I created a ticket on GitHub to investigate. You can watch it to track updates: https://github.com/analysis-tools-dev/static-analysis/issues... Thanks for the feedback!

How does SARIF compare to LSIF?

I had never heard of LSIF. Its webpage says it intends to "define a standard format for language servers or other programming tools to dump their knowledge about a workspace". Since SARIF targets the results of static analysis tools, it seems more straightforward to support it both for producing results and for reading them.

To be honest, since both formats are at least partially developed by Microsoft, they might have some common people behind them, but as a user and producer of static analysis tools, I'd prefer to have a single standard. SARIF is already standardized and gaining traction, so it seems the better choice for now.

Now we need a meta static analysis tool that reports from all static analysis tools

Heh, already done. https://endler.dev/2017/obsolete/ :)

Doesn't look like my project meets the eligibility requirements yet, too new. So I'll just share here for anyone interested.

Luanalysis - An IDE for statically typed Lua development. https://github.com/Benjamin-Dobell/IntelliJ-Luanalysis

You can already create a pull request and we'll merge it as soon as the acceptance criteria are met. In the meantime you'll get some exposure through the PR on https://github.com/analysis-tools-dev/static-analysis, which is where we also get some traffic (~500 unique visitors per week).

Sadly, many commercial source auditing tools like Coverity expressly forbid you from publishing any comparison or benchmark of their products, which is why you won't find great information out there.

What would be the repercussions if you still do? Assume the website is hosted in some banana republic...

I wonder what might be the motivation behind that. Sounds like a strange policy if they have confidence in their product. Maybe it's for legal reasons.

It doesn't allow them to control the narrative about their product. Suppose the reviewer doesn't know what they're doing, and has the wrong settings selected. Or uses code samples that are devilishly difficult (e.g., heavy use of function pointers) so that the results show the product in a very negative light.

If I build a car and someone shifts the gears without using the clutch, breaks the car and then whines about it, that might be irritating but totally legal. So I wonder if that sort of business practice would hold up in court.

That is really comprehensive and useful, thank you.

Thanks for the feedback. Glad you liked it. If you (or anyone else) find the time, please vote on a few tools and perhaps also add a comment for a tool? That would help us a lot as we basically have zero comments on the page right now and it's hard to crowdsource them.

This site could really use a filter option for cost of the product being mentioned.

Yes, that's the next feature we're working on. You'll be able to filter by license, maintenance level, and type (commandline, service,...). Jakub is currently restructuring the YAML source to allow for that (https://github.com/analysis-tools-dev/static-analysis/pull/4...). We wanted to launch early to gauge if we're heading in the right direction. Thanks for the feedback.
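Roughly, the filtering could work like the sketch below. The field names (`license`, `types`, `tags`) and the sample entries are placeholders for illustration and may not match the final YAML structure.

```python
from typing import Dict, List, Optional

# Placeholder entries; the real data lives in the YAML source of the list.
TOOLS: List[Dict] = [
    {"name": "tool-a", "license": "MIT", "types": ["cli"], "tags": ["python", "linter"]},
    {"name": "tool-b", "license": "proprietary", "types": ["service"], "tags": ["php"]},
    {"name": "tool-c", "license": "MIT", "types": ["cli", "service"], "tags": ["go", "formatter"]},
]


def filter_tools(
    tools: List[Dict],
    license: Optional[str] = None,
    tool_type: Optional[str] = None,
    tag: Optional[str] = None,
) -> List[str]:
    """Return the names of tools matching every filter that was given."""
    matches = []
    for tool in tools:
        if license is not None and tool.get("license") != license:
            continue
        if tool_type is not None and tool_type not in tool.get("types", []):
            continue
        if tag is not None and tag not in tool.get("tags", []):
            continue
        matches.append(tool["name"])
    return matches


if __name__ == "__main__":
    print(filter_tools(TOOLS, license="MIT", tool_type="cli"))
```

Filters that are left as `None` are ignored, so the same function covers "all MIT tools" as well as "MIT command-line formatters".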

Never heard of Teamscale before, yet it has a lot of votes in all its categories.

Also one of those expensive ones.

We'll split up votes per category in the future. This way, tools that support multiple categories won't have a competitive advantage.

Great work! I think it would be helpful to tag static analyzer tools that are dedicated to security with a security tag (SAST tools like Brakeman, Fortify SCA, Checkmarx CxSAST, Coverity, etc.). OWASP lists a bunch here: https://owasp.org/www-community/Source_Code_Analysis_Tools

Good idea. Will do!

How can I easily integrate all of them in my CI?

It sounds a bit ironic, but I'll bite: you shouldn't. Rather, carefully pick a few tools based on peer review, license, and your goals, and start slowly. Too many tools at once will cause analysis fatigue.

That said, in the future we hope to provide a paid video course about static analysis for e.g. Python and Go if there is market interest.

Nearly all of the major security-focused static tools have CI integration and guidance on best practices for integration.

I'm not sure voting works?

It does work; there is just no instant feedback. Voting happens through Firebase, and when you reload the page the vote is not reflected immediately. Does anyone know how other platforms like Reddit solve that?

It seems not to? I voted on something last night (selfishly, my own tool), but it's still not reflected.

Static pages with the votes get rendered once per day. It might be that you just missed that deadline yesterday. We'll look into it now. Please ping us if you can't see the votes by tomorrow.

Am I the only one worried about unauthenticated voting? I guess it'll become a major headache pretty quickly. Imagine people using bots to vote "up" on their favorite tools and "down" on the others.

Yeah, this concerns us as well ;-)

So we have two measures in place right now:
- We rate limit voting requests
- You can vote just once per tool
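For the curious, the two measures combined could look roughly like this in-memory sketch. It is not our production code; the class name `VoteGuard`, the sliding-window approach, the limits, and the way a client is identified (`client_id`) are all assumptions for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Set, Tuple


class VoteGuard:
    """Sliding-window rate limit plus a one-vote-per-tool check."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)
        self._seen: Set[Tuple[str, str]] = set()

    def allow(self, client_id: str, tool: str, now: Optional[float] = None) -> bool:
        """Return True if this vote should be counted."""
        now = time.monotonic() if now is None else now
        recent = self._recent[client_id]
        # Forget requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # rate limited
        recent.append(now)  # every accepted request counts towards the limit
        if (client_id, tool) in self._seen:
            return False  # this client already voted for this tool
        self._seen.add((client_id, tool))
        return True


if __name__ == "__main__":
    guard = VoteGuard(max_requests=2, window_seconds=60.0)
    print(guard.allow("client-1", "black", now=0.0))
    print(guard.allow("client-1", "black", now=1.0))
```

Neither measure stops a determined bot that rotates identities, which is why unauthenticated voting remains a concern.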
