
Automated code review for Python, Django, etc. - cneumann81
https://www.quantifiedcode.com
======
sulam
No offense, but saying this is 'automated code review' and specifically
mentioning something like "comply with regulatory standards" is missing the
point. The whole point of these regulatory standards (such as SOX) is to put
an additional human in the loop, so that a single engineer can't do something
obviously bad with people's credit cards, or whatever it is.

Also, given that this is glorified linting, it should be run _before_ code
review. Any code review that focuses on issues that can be picked up via a
linter isn't actually a review. Code reviews are a higher level analysis.

~~~
noir_lord
For iteration speed, in-IDE linting/checking makes sense, I think.

~~~
ThePhysicist
Agreed, we're working on a version of the tool that can be integrated into an
IDE.

------
jquast
This looks like a poor clone of [http://landscape.io/](http://landscape.io/)
\- viewing what they have to offer on GitHub, I'm not very impressed at all;
landscape.io's internal tool,
[https://github.com/landscapeio/prospector](https://github.com/landscapeio/prospector),
does a whole lot more.

I think calling it "code review" as opposed to "health" or just "lint checker"
is marketing bullshit aimed at managers who think all work of engineers could
be magically boiled down to automation.

That said, I make _heavy_ use of prospector, along with or wrapping: dodgy,
frosted, mccabe, pep257, pep8, pylint, pyroma, restructuredtext_lint, doc8,
and sphinx-build "warnings as errors" as part of my builds.

Result: Code that becomes a public source reference used by other projects.
Contributions from people who are able to quickly dive in and find where to
place the change they wish to propose. Through enforced styling, the entire
codebase reads as though it has a single author, "same throughout"; this can
make a ~5K LOC codebase as easy to adapt to as one a tenth its size.

I took a ~25K LOC open source project, ran it through landscape.io, worked
(along with closing many other bugs) from "50% health" towards "97% health",
and the readability and maintainability of the project increased drastically.
It does work.

Why are these checkers useful? Because I want people who review code,
especially in business, to spend their time on the purpose of the code, and
not spend a single minute of their time considering whether it suits the
"style" of their engineering department. I have seen too many people use code
review for pedantic styling.

If I can have the tools reject the code before a person does, it lowers the
time/cost of a purposeful review. It also bruises fewer egos to get only 1 or
2 open issues as opposed to 15-20.

Crude napkin estimates at current and past employers gauge the value of the
software at roughly $10-$25 per line of code. It should be treated as such.

~~~
ThePhysicist
Thanks for the feedback! Compared to landscape.io (or other tools) we have a
different approach to code checking, since we develop our own algorithms and
even allow our users to create custom code checks using a code pattern
language
([http://docs.quantifiedcode.com/patterns/language/index.html](http://docs.quantifiedcode.com/patterns/language/index.html)).

landscape.io is a great tool of course (in fact we've been in touch with Carl
for a while) but we really wanted to go in a different direction with this
and build a tool that goes beyond providing a simple frontend to existing
tools. The current beta of QuantifiedCode is of course just the beginning and
does lack many features that we're currently working on (e.g. giving the user
access to the graph representation of code that we use internally), so stay
tuned for more.

BTW we have open-sourced our own "Prospector"-like tool, checkmate
([https://github.com/quantifiedcode/checkmate](https://github.com/quantifiedcode/checkmate)).
Right now it's still in beta, though, and we don't pursue it very actively
since we now mainly use our graph-based approach to code analysis (so we
don't need to aggregate output from different tools).

~~~
jquast
I did look at checkmate. It was my mistake to believe this was the sum total
of the checks used in QuantifiedCode.

One problem I have with such paid solutions is that in business, where I am
willing to pay, I could not give a SaaS tool like codechecker access to an
enterprise GitHub hosted on an internal network (or possibly some other
repository service).

I'd be happy to pay, but it just isn't technically feasible unless there is a
self-hosting solution, which complicates things for you beyond what is
profitable. I don't know what the solution is, but this is why I use only
command-line tools like prospector in business, rather than simply paying for
landscape.io or codechecker.

I'll stay tuned, thanks for responding.

------
mkoryak
I am hesitant to try this because of the pricing. Maybe it's just me, but
paying 150 bucks per month for a glorified linter seems insane.

Do people actually pay this much for this product? I would love to see some
testimonials on the site, because otherwise I can't see this being worth
$150/mo.

my 2 cents

~~~
wyldfire
If they'd have called it a "static analyzer" they might have had an easier
time selling it. Businesses regularly shell out big bucks for C/C++/Java
static analysis tools like Coverity, Klocwork, etc.

Static analysis is less likely to yield fruit (design errors and not false
positives) in a language like Python. So the bar's definitely higher for these
guys than it is for static analysis tools for statically typed languages.

~~~
ThePhysicist
Good point! You're absolutely right when you say that static analysis is more
difficult to do for dynamically typed languages like Python. This is why we
develop our own type inference algorithms in order to generate additional
knowledge about the code beyond its structure, which in turn allows us to
detect more interesting and relevant problems.
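To make the difficulty concrete, here is a hedged, self-contained sketch (all
names made up, not QuantifiedCode's actual algorithm): a function whose
result type depends on a runtime value, which is exactly the case a purely
structural checker cannot reason about without some form of type inference.

```python
def parse_port(raw):
    """Return the port as an int, or None when the input is empty."""
    if not raw:
        return None  # one code path yields NoneType
    return int(raw)  # the other yields int

port = parse_port("")
# A purely syntactic check sees nothing wrong above, but a type-aware
# analyzer can warn that `port` may be None before it is used, e.g.:
# next_port = port + 1  # would raise TypeError at runtime
```

A checker that tracks both branches of `parse_port` can flag the commented
line as a possible `TypeError`; one that only inspects the AST's shape cannot.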

------
sirclueless
Is there a way to silence particular errors (i.e., false positives)? For
example, the Python linter is unhappy that I have the following in my code:

    
    
        ParticipantFormSet = modelformset_factory(Participant, can_delete=False, max_num=8, extra=8,
                                                  form=ParticipantForm)
    

It claims that this is a violation of the naming convention for variables, but
in this case ParticipantFormSet is actually a Python class, albeit created
with a factory function.

This is just one example, but I would think long-term that it would be a great
annoyance to deal with persistent misapplication of rules.
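For comparison, with a conventional linter such as pylint this kind of false
positive is usually silenced with an inline disable comment scoped to the one
statement. A self-contained sketch (the `class_factory` helper is a made-up
stand-in for `modelformset_factory`, so it runs without Django):

```python
def class_factory(name):
    """Stand-in for a factory such as Django's modelformset_factory."""
    return type(name, (), {})

# The CapWords name is deliberate (the value really is a class), so we tell
# pylint to skip its naming check for this single statement:
ParticipantFormSet = class_factory("ParticipantFormSet")  # pylint: disable=invalid-name
```

`invalid-name` is pylint's message id for naming-convention violations; the
trailing comment suppresses only that message on that line.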

~~~
ThePhysicist
Thanks for the feedback! This feature will be available very soon, we'll
notify all users by e-mail as soon as it's deployed.

Right now, you can already disable certain patterns in your project settings
if you're bothered by individual messages.

------
eliben
The most important part of a code review is the _why_, not the _how_.

It's pretty easy to implement pre-commit checks that would validate and
enforce style and certain lint checks - many companies have these. No
automatic check will make sure that the right problem is being solved with the
right approach, though.

~~~
dkhenry
Having never used this specific product, I can't make claims about any of its
functionality, but what I have found is that the gap between what one _can_ do
and what people actually _do_ is very, very large. Yeah, you can implement all
this yourself using pre-commit hooks, but do you? Right now you could pay a
little money and vastly improve the quality of your code base. There is value
in getting a packaged version of something and having it just work.

~~~
ThePhysicist
Good point! When asking developers about their use of static code analysis to
ensure code quality, we found that most development teams use such tools only
sporadically or not at all, and only a few teams have a systematic process or
set of guidelines for ensuring code quality throughout all of their projects.
So while some technologies already exist for this, we definitely think that
there's room for more.

~~~
gknoy
I'm not a fan of pre-commit hooks (as I'd rather write it, lint it,
fix-rebase it), but I love having a CI suite that includes the linting tests
that our team has agreed on. It lets me code away and lint things when I'm
ready to make my pull requests.

~~~
jquast
Agreed, git hooks are a terrible place for this; they can actually impede
progress and can also be skipped, disabled, or never installed in the first
place.

The time to reject a build for lint failures is at merge to trunk/next
release, which a good CI setup should already have checked.

------
Too
JetBrains PyCharm does many of the same checks for free, and it's integrated
straight into the IDE, so you get the error one second after you write it and
can fix it immediately. What does this offer me that PyCharm's analyzer does
not?

------
davej
So this is a linter? What does this give me that django-lint and pylint
don't?

~~~
ThePhysicist
It's more a tool for writing your own linters: We have developed a code-query
language (Cody -
[http://docs.quantifiedcode.com/patterns/language/index.html](http://docs.quantifiedcode.com/patterns/language/index.html))
that allows you to very easily write your own code checkers.

Using that language we have implemented many checks from PyLint, PyFlakes etc.
but for us the most interesting feature is that you can easily translate your
own conventions into automated rules and check them on each commit.

So while we offer PyLint patterns as well, we give you a way to go well
beyond what normal linters can offer.
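Cody's own syntax isn't reproduced here, but as a rough, hypothetical sketch
of what a custom convention check amounts to, here is an equivalent rule
written against Python's stdlib `ast` module (the "no bare print() calls"
house rule and all names are invented for illustration):

```python
import ast

class PrintCallChecker(ast.NodeVisitor):
    """Collect the line numbers of bare print() calls."""

    def __init__(self):
        self.violations = []

    def visit_Call(self, node):
        # Flag calls whose target is the plain name `print`.
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            self.violations.append(node.lineno)
        self.generic_visit(node)  # keep walking nested nodes

source = "x = 1\nprint(x)\n"
checker = PrintCallChecker()
checker.visit(ast.parse(source))
# checker.violations -> [2]
```

A pattern language like Cody presumably lets you express the same intent
declaratively instead of writing the visitor boilerplate by hand.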

~~~
ska
Interesting ... but I still wouldn't call it review. Both linting and review
are worth doing, but they achieve very different things. In fact, I often
think of linting as a good tool to reduce noise in the review process.

~~~
ThePhysicist
Thanks for the feedback! What would you call it? Static analysis? Linting?
"Code linting" seemed to us too narrow and not widely known a term, while
most people seem to know what "code review" means, hence our decision to call
it "automated code review".

~~~
ska
Well, unless I've missed something, it is by definition static analysis -- so
that's probably a good start.

The way I look at it is that every minute you spend discussing a point in a
review that could have been caught/corrected by a static tool is a minute
wasted. But these tools can't do any of the actual work of the review.

By the way, linting is a fairly well known term with a long history, but it
wouldn't surprise me if there are communities that don't know the term.

------
geromek
I've been working on static analysis products for almost a decade and I must
admit I am not impressed with your "AST visitor approach" for creating custom
checks. It seems you are doing the same as SonarQube.

However, I am curious about your claim "Unlike any other code checker out
there, we've conceived code error as patterns, not as rules." Could you be
more specific?

~~~
ThePhysicist
Thanks for the feedback! Our "AST visitor" approach is currently not very
different from existing methods. However, we store the whole AST (including
earlier versions of the code) in a graph for fast retrieval and add a lot of
context information to each vertex in the code tree. Having the code patterns
and faulty code in a database, we can make use of user feedback to
(semi-)automatically improve our code patterns for a given error using
example-based learning. In that sense the algorithm is much more adaptable
than existing technologies.

------
davidthewatson
I tried this on one of my projects today. Created an account via GitHub,
enabled the project, watched the wheels spin. I'm approaching 12 hours now
and it still says: "This project will be analyzed soon. Please check back
later." I'd like to like it, but that performance isn't going to win many
converts.

------
phonon
How does this compare to Semmle? [https://semmle.com/2014/06/semmle-analysis-now-includes-python/](https://semmle.com/2014/06/semmle-analysis-now-includes-python/)

~~~
ThePhysicist
I'd say: Similar approach, different target market ;) Semmle provides really
good code analysis for Python as well (we're in touch with some developers
there in fact) but targets mostly big companies with their product. Contrary
to that, our mission is to make state-of-the-art code analysis available to
everyone :)

