
Amazon CodeGuru – Preview - Sheepzez
https://aws.amazon.com/codeguru/
======
dakna
So let me get this straight:

Amazon packages open source software (Linux, Postgres etc) in a way that is an
abstracted service (RDS, EBS, Elastic Load Balancer). They add so many
abstracted building blocks that you need a special skill set to manage them
(AWS Certified Solutions Architect) instead of knowing how to do this with
bare metal or a container image running in your own data center.

And now that things are complicated and developers might make mistakes using
those services, they add a profiler that inspects your code running in
production and a reviewer that ties into the stage before deployment. All just
to optimize the use of their own services.

From a business perspective this is an awesome way to get vendor lock-in to a
much higher degree. They are basically the certifying authority that tells you
if your intellectual property (your code) conforms to their own standard. Yes,
they show examples of standard Java optimizations, but it clearly says it
detects deviation from best practices for using AWS APIs and SDKs.

And people were mad at Microsoft for shipping a non standards compliant
browser as default and enriching it with HTML tags and plugins that would only
work in that browser. Little did we know.

I personally wait for the "Amazon Compliant Code" label in the not too distant
future as a selling point for business people.

~~~
seibelj
Amazon has made scalable, performant, high-availability systems constructable
by a 10 person team that serves millions or even billions of people. Before
AWS it took thousands of people and billions in capital investment to do so.

Sure, Google and Microsoft and IBM joined the party, but AWS was first and
remains the best holistically. This is their moment of domination, and
eventually something will knock them down, but they have made so many
companies so nimble and powerful in ways that were impossible before. Go
Amazon.

~~~
mythz
>Before AWS it took thousands of people and billions in capital investment to
do so.

WhatsApp Stats (2014):

\- 450 million active users, and reached that number faster than any other
company in history.

\- 50 billion messages every day across seven platforms (inbound + outbound)

\- 32 engineers, one developer supports 14 million active users

\- $60 million investment from Sequoia Capital

They managed their own FreeBSD servers, hosted on SoftLayer.

[1] [http://highscalability.com/blog/2014/2/26/the-whatsapp-
archi...](http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-
facebook-bought-for-19-billion.html)

YouTube (2008):

"YouTube grew incredibly fast, to over 100 million video views per day, with
only a handful of people responsible for scaling the site"

\- 2 sysadmins, 2 scalability software architects

\- 2 feature developers, 2 network engineers, 1 DBA

"They went to a colocation arrangement. Now they can customize everything and
negotiate their own contracts."

"Sequoia invested a total of $11.5 million in two separate rounds and was the
only venture firm to invest in the company." [3]

[2] [http://highscalability.com/youtube-
architecture](http://highscalability.com/youtube-architecture)

[3] [https://www.nytimes.com/2006/10/09/business/09cnd-
deal.html](https://www.nytimes.com/2006/10/09/business/09cnd-deal.html)

~~~
kbenson
I'm pretty sure GP is mistaking valuation for capital. Serving half a billion
or more people probably nets you a billion dollar valuation or more these
days, but it in no way requires a billion dollars to provide that service in
the vast majority of cases.

There is a sweet spot where cloud is good and provides some benefit but, once
you're serving hundreds of millions of people and have double-digit millions
in investment, you can probably do significantly better cost-wise rolling your
own servers. Worst case, you just throw your own hypervisor management system
on them and have most of the same features you got from a cloud service. If
you're smart, you can probably architect it so you have on-demand overflow
capacity from a cloud provider in case there's a spike you can't account for,
which is the best of both worlds.

~~~
pard68
This is how we do it. Two on prem datacenters, one colo, and a handful of on-
the-ready cloud providers. We serve far fewer users, but we also are getting
20 to 50k per user per year. Needless to say, at the scale we have cloud is
out of the question except in catastrophic scenarios.

------
gregdunn
Disclaimer: I work at AWS on an unrelated team. I was not involved in
development of this product. Opinions stated are my own, and not necessarily a
reflection of my employer. Nothing here is being posted in any sort of
official capacity.

There's lots of focus here in the comments on the code reviewer portion, but
one of the things I'm most excited about is the profiler -
[https://aws.amazon.com/codeguru/features/](https://aws.amazon.com/codeguru/features/)

I do a lot of performance engineering work, and one of my go to tools for
visualizing where programs are spending their time is flamegraphs. While you
can certainly create them with profilers besides CodeGuru (and I do not work
with Java, so I haven't yet had the chance to check out CodeGuru for any of my
use cases), I'm super excited about anything that gets more people using them.
They make it very easy to see where your optimization opportunities are, and I
have personally found them very useful when working with our customers -
they're way easier, in my opinion, to go through and explain than just looking
at raw perf output or similar.

~~~
richdougherty
A profiling tool I want to try out—it seems almost magical—is Coz. It can
estimate the effect of speeding up any line of code. It does this by pausing
(!) other threads, so it gives a 'virtual' speed up for that line.

What's interesting is that this technique correctly handles inter-thread
effects like blocking, locking, contention, so it can point out inter-thread
issues that traditional profilers and flame graphs struggle with.

Summary: [https://blog.acolyer.org/2015/10/14/coz-finding-code-that-
co...](https://blog.acolyer.org/2015/10/14/coz-finding-code-that-counts-with-
causal-profling/)

Video presentation:
[https://www.youtube.com/watch?v=jE0V-p1odPg&t=0m28s](https://www.youtube.com/watch?v=jE0V-p1odPg&t=0m28s)

Coz: [https://github.com/plasma-umass/coz](https://github.com/plasma-
umass/coz)

JCoz (Java version):
[http://decave.github.io/JCoz/](http://decave.github.io/JCoz/) and
[https://github.com/Decave/JCoz](https://github.com/Decave/JCoz)

~~~
antpls
I have never heard of this kind of profiling before, thanks for sharing

------
redler
One of their screenshot examples flags inefficient code in crypto libraries,
and the suggested "fix" is "Evaluate switching to the Amazon Corretto Crypto
Provider ACCP". I don't know enough about the subject matter area to know
whether that's the right move, but it's interesting that CodeGuru is
apparently, among other things, an opportunity to pay Amazon to upsell you on
replacing some of your code with one of the panoply of services in the AWS
universe.

~~~
ensignavenger
Amazon Corretto is their OpenJDK Java distro that is free and Open Source. I
don't know if the project was already using Corretto or not, but it makes
sense for them to recommend their own, supported, open source solution.

~~~
redler
Your point is a fair one, and I admit unfamiliarity with Corretto,
specifically. But they're telegraphing, right on the tin, that this new
service will indeed recommend solutions of the form "we see a problem pattern
in your code; try Amazon _____". The fact that Corretto is actually open
source further muddies the waters.

~~~
weego
We'll need to see what comes out in the wash. Maybe the more OSS they
encounter the more suggestions it will be able to make. Or maybe not and it
will indeed be a sales pitch masquerading as a feature.

------
ReidZB
The code review feature seems too expensive to run on every PR automatically
(to me): $0.75 per 100 lines of code. From their example pricing: "if you have
a typical pull request with 500 lines of code, it would only cost $3.75 to run
CodeGuru Reviewer on it." I wonder if it's actually good enough to justify
that price.

~~~
auslegung
That’s incredibly cheap, assuming it provides good suggestions. How much time
does it take you to review 500 lines of code change, and what’s your time
worth? If it takes 10 minutes and your time is worth about $20/hour or more,
this service will pay for itself immediately.
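
The arithmetic in this exchange is easy to check. A minimal sketch, assuming the quoted preview price ($0.75 per 100 lines) scales linearly with line count; the class and method names are hypothetical, and only the price figure comes from the thread.

```java
// Hypothetical helper; only the $0.75 per 100 lines figure comes from the thread.
final class ReviewCost {
    // Price in cents per 100 lines of code, as quoted from the pricing example.
    static final double CENTS_PER_100_LINES = 75.0;

    // Cost in cents of reviewing a pull request of the given size,
    // assuming the price scales linearly with the number of lines.
    static double reviewCostCents(int lines) {
        return lines * CENTS_PER_100_LINES / 100.0;
    }

    // Hourly rate (in dollars) at which the human time saved equals the fee.
    static double breakEvenHourlyRate(int lines, double minutesSaved) {
        double dollars = reviewCostCents(lines) / 100.0;
        return dollars / (minutesSaved / 60.0);
    }

    public static void main(String[] args) {
        System.out.printf("500 lines cost $%.2f%n", reviewCostCents(500) / 100.0);
        System.out.printf("Break-even for 10 minutes saved: $%.2f/hour%n",
                breakEvenHourlyRate(500, 10));
    }
}
```

Under these assumptions a 500-line review costs $3.75, and ten minutes of saved reviewer time covers that at about $22.50/hour, slightly above the $20/hour figure.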

~~~
EpicEng
Our code reviews are far more "is this the right way to solve the problem?"
than "hey, you never use that variable you declared." The latter would be
picked up by our linter; I'm having a hard time seeing the value proposition
here.

>It’s like having a distinguished engineer on call, 24x7

I don't believe that, regardless of how many times they sprinkle in the words
"machine" and "learning".

~~~
dickjocke
I'd be very surprised if the service they've announced is a linter.

The announcement says it can even analyze parts of your code that are more
computationally expensive than they need to be. I'm not sure I understand the
skepticism--surely they have among the largest code repositories in the world.
Why couldn't they train models on it to look at best practices and even
compare code practices to different metrics.

~~~
ShamelessC
Isn't this just calculating the cyclomatic complexity?

~~~
ralmeida
No - OP meant _computationally_ expensive, not _cognitively_ expensive. Two
nested for-loops can be O(n^2) but can have a cyclomatic complexity as low as
1.
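
A small illustration of the distinction, with a made-up class and method name. The control flow below has the same handful of branch points no matter how large the input is, so a cyclomatic complexity metric reports a small fixed number, while the amount of work grows with the square of the array length.

```java
// Illustrative only: the class, method, and inputs are invented for this sketch.
final class PairProducts {
    // Sums a[i] * a[j] over every ordered pair (i, j).
    // For n elements the inner statement runs n * n times.
    static long sumOfPairProducts(int[] a) {
        long total = 0;
        for (int i = 0; i < a.length; i++) {        // outer loop condition
            for (int j = 0; j < a.length; j++) {    // inner loop condition
                total += (long) a[i] * a[j];        // no further branching
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfPairProducts(new int[] {1, 2, 3})); // prints 36
    }
}
```

A structural metric sees two loops here whether the array has 3 elements or 3 million; a profiler sees the difference immediately.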

------
veselin
I found what it generates.

[https://github.com/pediredla/Algorithms/pull/3/files](https://github.com/pediredla/Algorithms/pull/3/files)

It looks like a linter, but maybe there is more.

~~~
faitswulff
I've never seen a linter tell me problems with code in this detail before:

> You are using a `ConcurrentHashMap`, but your usage of `get()` and `put()`
> may not be thread-safe at lines: __110, 113, 135, and 137__. Two threads
> can perform this __same check at the same time__ and one thread can
> __overwrite__ the value written by the other thread.
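
For readers unfamiliar with the pattern being flagged, a minimal sketch follows. This is not the code from the linked pull request; `HitCounter` and its methods are hypothetical. It contrasts a check-then-act sequence built from `get()` and `put()` with the atomic `merge()` that `ConcurrentHashMap` provides.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example class, not taken from the reviewed repository.
final class HitCounter {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Check-then-act: get() and put() are each thread-safe, but the pair is
    // not atomic. Two threads can read the same old value, and one of the
    // two updates is then overwritten by the other.
    void recordRacy(String key) {
        Integer current = hits.get(key);
        if (current == null) {
            hits.put(key, 1);
        } else {
            hits.put(key, current + 1);
        }
    }

    // merge() performs the read-modify-write atomically for the key.
    void recordAtomic(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    int count(String key) {
        return hits.getOrDefault(key, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        HitCounter counter = new HitCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.recordAtomic("page");
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println(counter.count("page")); // prints 200000
    }
}
```

Swapping `recordAtomic` for `recordRacy` in `main` may print a smaller number on some runs, which is the lost-update problem the review comment describes.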

~~~
stock_toaster
staticcheck for Go does _some_ of this kind of thing (documenting improper
usage).
[https://staticcheck.io/docs/checks](https://staticcheck.io/docs/checks)

~~~
Yeroc
SonarQube does as well. I've been shocked at some of the analysis it does.

------
farslan
Next step is to provide automated fixes. I have a side project that does it
for Go source code: [https://fixmie.com](https://fixmie.com) (have plans for
other languages and protocols).

But due to my visa situation here in the US (H1B), I'll never be able to monetize
it as it's illegal to have a side income. But I think this is just the start
and there is a huge opportunity for new startups and projects.

~~~
chirau
It is not impossible for you to earn second income on H1-B, it's just that the
secondary source would need its own visa petition.

DISCLAIMER: I am not an attorney. More importantly, I am not your attorney.
The above is not legal advice. If you desire legal advice, consult a
competent, licensed attorney in your area.

~~~
falcor84
What's the probability that such an unsponsored visa would be approved, and
within even a remotely relevant timeframe?

~~~
8ytecoder
Zero. H1B, by definition requires a specialty _and_ a sponsor who can fire you
(I don't know the exact phrasing but this is what prevents someone with an H1B
from starting their own company)

~~~
chirau
Before I proceed...are you sure and how sure are you? Would you like to
entertain a bet against that nonsensical zero of yours, as if you actually
know this.

I am always fascinated by deducive ignorami who pretend to have done the
research, like yourself.

------
013a
Something doesn't sit right with me concerning their use of "over 10,000" open
source projects on Github to train the AI, then immediately turning around and
telling those same projects "thanks, that'll be $0.75/100 lines of code
scanned."

I feel like this should have a generous free tier for open source projects. I
feel that very, very strongly.

------
eigenvalue
They should compute the SHA512 hash of lines of code or code blocks from well-
known open source projects and then just give you pre-computed "reviews" for
those lines/blocks, and then only charge for "novel" code. Otherwise you would
need to waste time segregating your original code from the various packages
you use. And it seems unfair to charge customers for canned results that can
be cached and served at very low cost.
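
A rough sketch of what that caching idea could look like. Everything here is hypothetical (the `ReviewCache` class, the reviewer callback, the in-memory map) and says nothing about how CodeGuru actually works; it only shows hashing a code block with SHA-512 and paying for a review once per distinct block.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of hash-keyed review caching; not an AWS API.
final class ReviewCache {
    private final Map<String, String> reviewsByHash = new HashMap<>();
    private int paidReviews = 0;

    // Hex-encoded SHA-512 digest of a code block's UTF-8 bytes.
    static String sha512Hex(String codeBlock) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] digest = md.digest(codeBlock.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Treat a JDK without SHA-512 as a configuration error.
            throw new IllegalStateException(e);
        }
    }

    // Returns the cached review for a known block, otherwise calls the
    // (expensive, billable) reviewer once and remembers its answer.
    String review(String codeBlock, Function<String, String> expensiveReviewer) {
        String key = sha512Hex(codeBlock);
        String cached = reviewsByHash.get(key);
        if (cached != null) {
            return cached;
        }
        paidReviews++;
        String result = expensiveReviewer.apply(codeBlock);
        reviewsByHash.put(key, result);
        return result;
    }

    // Number of times the expensive reviewer was actually invoked.
    int paidReviews() {
        return paidReviews;
    }

    public static void main(String[] args) {
        ReviewCache cache = new ReviewCache();
        cache.review("int x = 1;", code -> "looks fine");
        cache.review("int x = 1;", code -> "looks fine");
        System.out.println(cache.paidReviews()); // prints 1
    }
}
```

One caveat with exact hashing: any change to whitespace or formatting produces a different digest, so a real system would probably need to normalize code before hashing.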

~~~
drchewbacca
I think you can set it to only scan when new pull requests are made. So you
could commit your libraries etc without asking for review and then turn it on
only for code you have written.

I might be wrong though.

~~~
randomidiot666
Yes obviously you would just choose not to submit those irrelevant PRs to this
extremely overpriced linter (it's not a code reviewer)

------
shubidubi
Code review is not linting. Code review is a chance to discuss design, scaling,
trade-offs and mentor others. I don't think this solution will offer it.

~~~
MisterPea
I think this does what you mention and not the former. I would imagine this
works best when you have a codebase that heavily utilizes the AWS SDK so it
can internally 'paint a picture' of what's going on and provide better
architectural decisions and other best practices.

How well it works is beyond me though

~~~
randomidiot666
Bullshit. You are vastly overestimating the "intelligence" of this overpriced
linter. It mechanically detects patterns. See this example:
[https://d1.awsstatic.com/re19/Screenshot_Catch-Code-
Issue_2%...](https://d1.awsstatic.com/re19/Screenshot_Catch-Code-
Issue_2%20-%20Annotations%20LP.df0deb64bbfb02219d429db4d5bd3efd089e2a89.png)
The kind of human-level artificial intelligence that you're suggesting this
would have, is science fiction.

~~~
MisterPea
Well I stand corrected, would've expected more from a company that knows all
the best practices for their own services

------
cangencer
I wanted to try it on a single repository, but it requested access to _all_
repositories, public or private and also needs admin access for webhooks. No
thanks.

------
udkl
Reminds me of the security analysis tool from FB

[https://engineering.fb.com/security/zoncolan/](https://engineering.fb.com/security/zoncolan/)

[https://www-wired-com.cdn.ampproject.org/c/s/www.wired.com/s...](https://www-
wired-com.cdn.ampproject.org/c/s/www.wired.com/story/facebook-zoncolan-static-
analysis-tool/amp)

Also reminds me of Sonatype or FindBugs, which do something similar but work
on a set of rules instead of on ML.

------
richnich
There is ZERO lock-in with this service. If people want to continue to do hand
code reviews themselves, they can. There is also ZERO in Amazon's announcement
that implies this is just about improving code specific to Amazon's platform.
Stop being open source purists and start understanding reality. Many (most?)
of the posters about this topic critiquing Amazon live in open source unicorn
land, and seem to have almost no understanding of business realities. (Sorry
for not saying what I really think but I thought I should be polite.)

------
totally
If you squint this looks like a baby step towards computers writing code.

~~~
JustSomeNobody
Not even close. We'll have to nail down how to write unambiguous design
specifications in a format that the AI can consume, first.

~~~
aripickar
Maybe we could come up with a consistent language to tell machines what to do
first.

------
mping
I'm wondering, did anyone actually try it? It's not impossible for an
automated tool to give valuable feedback on a PR, guys. Should be easier than
a self driving car I guess

~~~
TheOperator
Chess masters have long been combining human and computer analysis... even
before computers were able to actually beat humans at chess.

------
efdee
Poor naming choice, considering CodeGuru.com has been around for what,
decades?

~~~
mavsman
I worked on the relaunch of Cloud9 --> AWS Cloud9 and there was an extensive
naming process shortly before the launch of the service, both for AWS Cloud9
and for the term "environments" (which was previously workspaces).

I can tell you that there were lots of managers, PMs, directors, etc involved
and they considered tons of naming options. They took into account third party
services/products, first party services/products, and other things that might
have overlap. This was likely the situation here and they accepted this as a
drawback.

That said, you're free to disagree and that doesn't mean it was the right
choice, just wanted to point out that this was not an oversight.

~~~
GordonS
> just wanted to point out that this was not an oversight

With codeguru.com being a long established site, doesn't that make it _worse_?

------
CodeSheikh
$0.75 per 100 lines of code scanned per month. Wonder if it automagically
ignores new lines and javadocs?

------
yellowait44
Oh my! Son of Anton is watching my code!

~~~
dipthegeezer
In case people are wondering about the reference.
[https://www.youtube.com/watch?v=xzx5Hwg24xw](https://www.youtube.com/watch?v=xzx5Hwg24xw)

------
SlowRobotAhead
Was there a list of supported languages somewhere? I couldn't find it.

~~~
hsaliak
Only Java, it's in the FAQ.

------
gcbw3
If it was minimally not garbage, they would have run it on any high profile
open source project and promoted the results.

Hence, it is pure garbage.

~~~
millstone
Many (most?) of the checks appear to be specific to Amazon libraries.

------
GordonS
> CodeGuru is inexpensive enough to use for every code review and application
> you run

> For example, if you have a typical pull request with 500 lines of code, it
> would only cost $3.75 to run CodeGuru Reviewer on it

Wat?!

Come on, $4 per review is _not_ inexpensive, especially for what is
essentially a glorified SAST!

------
seanwilson
Is there a reason AWS products have names where you can rarely guess what they
do?

Why not something more obvious like one of AWS Code Auditor/Reviewer/Checker?

------
z3t4
Code review if done right is a place to learn. Criticizing (and automatically
fixing) code style issues and nit-picks can be done by a machine.

------
ecuaflo
Why do this in the code review stage as opposed to in the code editor linting
stage? I'd wanna have these suggestions before pushing.

------
eddywebs
An open source alternative is PMD tool -
[https://pmd.github.io](https://pmd.github.io)

------
miheermunjal
Assuming trust in the AWS code reviews (I mean, that dataset is huge), I
suspect this has use in the review portion even without considering profiling.
Hoping there is more detail on the ML used, as it appears more adaptable than
current rule based code reviewing solutions... here come more and more
Dev-focused integrations at the code level

------
whb07
Neat idea and a good place to start. But most of the time, people are
willingly and sometimes violently opposed to automated free tooling like
linters, formatters etc. I’m not holding my breath for that cohort (majority
of devs).

------
djhworld
I got to preview this service (the code review service) a few weeks ago.

The best thing about it was the recommendations on how to use the AWS SDK
better as that's probably got the most potential to drift or make mistakes on

~~~
buboard
considering how it can be a driver for AWS sales, they should give the service
for free though

------
conwy
It seems good for performance, and probably only that. I don't see anything in
the pitch about readability, maintainability, extendability, security,
usability/accessibility or portability.

------
amai
What can it do better compared to solutions like
[https://www.sonarqube.org/](https://www.sonarqube.org/)?

------
mirchibajji
If anyone from AWS is reading this, is there a plan to support GitLab? We have
a self-hosted GitLab enterprise on AWS, and wondering if we can try this out

------
dchichkov
Now, is it a code review with ML, or is it a data collection service with a
human backend. That wants to be code review with ML ;)

------
rodgerd
Your programming job is Jeff's opportunity.

------
formalsystem
Is this vaporware?

------
scblock
Please do us a favor and consolidate all these Amazon announcements into a
single announcement page link. This is ridiculous.

~~~
dang
This always comes up when the big tech cos do their annual conference day
thing. We don't consolidate the posts, but we do downweight some of them, so
you're actually getting less (Amazon|Google|Apple|Microsoft)iness than the
system would otherwise be letting through.

[https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...](https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=by%3Adang%20annual%20conference%20day&sort=byDate&type=comment)

------
CodeSheikh
"Amazon CodeGuru is a machine learning service for automated code reviews and
application performance recommendations. It helps you find the most expensive
lines of code that hurt application performance..." I wonder whether AWS is using
customers' code bases to train its AI models? Another source is to scavenge
open source repositories.

~~~
alexithym
"CodeGuru’s machine learning models are trained on Amazon’s code bases
comprising hundreds of thousands of internal projects, as well as over 10,000
open source projects in GitHub" \- from the article.

~~~
mirekrusin
Oh no, the code quality is going to be shit.

~~~
eximius
They might be superficial, but if they did any sort of supervised training
(I'm assuming they did), then they probably won't be _wrong_.

