
AWS X-Ray – Distributed Tracing System - Trisell
https://aws.amazon.com/xray/
======
devbug
If you're interested in distributed system tracing, there's a lot going on.

As a starting point, I would recommend reading Google's paper on their project
"Dapper." [1] It's essentially the core of most distributed tracing systems.
At least those I've encountered.
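
For a rough idea of the model the Dapper paper describes: every span carries the ID of the trace it belongs to, its own ID, and its parent span's ID, so a collector can rebuild the call tree afterwards. The sketch below is illustrative only; `Span`, `start_trace` and `build_tree` are made-up names, not taken from Dapper, Zipkin, or any other library.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed unit of work inside a trace (hypothetical, simplified)."""
    name: str
    trace_id: str                      # shared by every span in the same request
    parent_id: Optional[str] = None    # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])

    def child(self, name: str) -> "Span":
        # A downstream call inherits the trace ID and points back at its caller.
        return Span(name=name, trace_id=self.trace_id, parent_id=self.span_id)


def start_trace(name: str) -> Span:
    """Create the root span for a new request."""
    return Span(name=name, trace_id=uuid.uuid4().hex)


def build_tree(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Group spans by parent span ID, which is what a collector does to rebuild the tree."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)
    return children
```

In a real system the trace and span IDs travel between services in request headers; that part is left out here.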

There's a lot of tooling out there that takes its cues from Dapper. I've
recently been looking into integrating OpenZipkin[2] with our systems. I see
it as a more viable alternative (no tie-in!) to Yet Another Proprietary Thing in
AWS (YAPTA). There are others as well, like AppDash.

Recently there has been a push towards an open standard for the collection
side called OpenTracing[3]. I came across it when investigating LightStep[4].
Ideally that means no vendor lock-in, which of course has lots of knock-on
effects.

If you know of any, I'd love to be pointed in the direction of _different_ and
not just divergent techniques.

[1]:
[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en//archive/papers/dapper-2010-1.pdf)

[2]: [http://zipkin.io](http://zipkin.io)

[3]: [http://opentracing.io](http://opentracing.io)

[4]: [http://lightstep.com](http://lightstep.com) \- Impressive team behind
this.

~~~
mmclean
Stackdriver Trace [1] is Google's external implementation of Dapper with
additional analysis features. It's available for free, even for workloads not
running on Google Cloud Platform.

(Disclosure: I work on it)

[1]: [https://cloud.google.com/trace/](https://cloud.google.com/trace/)

~~~
tonyhb
This is awesome! Can Google update the docs so that the links for external
zipkin integrations work? They currently link in circles and there's no info
on how to integrate into OpenTracing/Zipkin

~~~
mmclean
Uh oh - are you referring to the docs on GitHub or the actual docs site
([https://cloud.google.com/trace/docs/zipkin](https://cloud.google.com/trace/docs/zipkin))?

~~~
mmclean
Never mind, I think we found it. Fixing now.

~~~
tonyhb
This part:
[https://cloud.google.com/trace/docs/zipkin#configure_zipkin_...](https://cloud.google.com/trace/docs/zipkin#configure_zipkin_tracers)

It doesn't show zipkin configuration :)

~~~
mmclean
Yep, an issue with the anchor links that'll be updated soon. It's supposed to
just link to a lower part of the page.

If you scroll down you'll see a top-level header also titled "Configure Zipkin
tracers" that describes how to configure the Brave tracer. You don't really
need to do anything special here - just point your Zipkin tracers at the
Stackdriver Zipkin Collector rather than your existing one.

------
ranman
This blog post has more info:
[https://aws.amazon.com/blogs/aws/aws-x-ray-see-inside-of-you...](https://aws.amazon.com/blogs/aws/aws-x-ray-see-inside-of-your-distributed-application/)

I'll be talking about this on the
[https://twitch.tv/aws](https://twitch.tv/aws) stream at 12:30 Pacific if you
guys want to ask questions / learn more.

(I WORK AT AWS)

~~~
simook
^ Confirmed

~~~
Trisell
fixed!

------
andrewvc
I haven't used many of the new AWS services. Someone tell me, what's the
quality level? I'll be surprised if all the new AWS stuff works that well
given how divided their focus must be now.

~~~
brazzledazzle
AWS makes me wonder what the future of open source infrastructure will look
like. I'm sure they contribute back at least a little but providers have
little reason to contribute back the distributed/scale modifications they
make. They can take any project with a friendly license and essentially siphon
away its future user base by turning it into a service. It's basically what a
lot of open core companies do but it's different since those companies are
usually the creators/maintainers so they have a vested interest in preserving
and maintaining it.

I'm probably overthinking it but big bang announcements of multiple products
do make me wonder.

~~~
pquerna
> I'm sure they contribute back at least a little

From almost 2 years ago:

[https://news.ycombinator.com/item?id=9358843](https://news.ycombinator.com/item?id=9358843)

But the gist is, Amazon employees are generally discouraged from contributing to
open source at all.

~~~
brazzledazzle
I know they do a tiny bit here and there (there was a presentation here at the
re:Invent conference about it that I didn't get a chance to attend) but their
usage of it internally probably dwarfs those contributions.

It makes business sense even if it's a shitty thing to do. The stronger the
open source project that provides your managed service's core functionality,
the easier it is for a competitor to build the same thing.
Amazon isn't as cutthroat as Oracle but they're certainly much less OSS
friendly than many of their competitors.

------
DenisM
Turns out X-Ray is a _sampling_ service, it drops data. You can use it to debug
recurring problems, but it's no use when debugging a particular incident with
the customer on the phone. Bummer.

 _To provide a performant and cost-effective experience, X-Ray does not
collect data for every request that is sent to an application. Instead, it
collects data for a statistically significant number of requests. X-Ray should
not be used as an audit or compliance tool because it does not guarantee data
completeness._

[https://aws.amazon.com/xray/faqs/](https://aws.amazon.com/xray/faqs/)
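
To make the trade-off concrete, here is a minimal sketch of one common way tracing systems sample: decide once per trace, from a hash of the trace ID, whether to keep it. This is an assumption for illustration, not X-Ray's documented algorithm; `should_sample` and its parameters are hypothetical names.

```python
import hashlib


def should_sample(trace_id: str, rate: float) -> bool:
    """Keep roughly `rate` (0.0 to 1.0) of all traces, deterministically per trace ID.

    Hashing the trace ID means every service that sees the same ID makes
    the same keep/drop decision, so a kept trace is kept end to end.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")   # uniform integer in [0, 2**64)
    return bucket < rate * 2**64
```

With a rate of 0.1, roughly nine out of ten requests leave no trace at all, which is why a specific customer's request may simply not be there when you go looking for it.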

~~~
dkuebric
Yeah, there's a big difference between something that's pure distributed
tracing (this) and something that does metrics/monitoring as well (TraceView,
New Relic, ...). Sampling is a valid way to make distributed tracing scalable
and performant, but it can also limit the use-cases for the data.

I imagine this being used for episodic debug cases, where one could turn up
the sample rate and pay only for the traces captured/queried during an
incident. But you would need to do your monitoring and trending separately.

I wouldn't be surprised if they eventually start integrating this better with
CloudWatch for that reason, though it doesn't seem to be doing any of that
today.

Disclosure: I work on TraceView (traceview.solarwinds.com), which is a
distributed-tracing-based APM product. We're inspired by Google Dapper and
X-Trace, both mentioned elsewhere in this thread.

~~~
DenisM
> difference between something that's pure distributed tracing (this) and
> something that does metrics/monitoring as well (TraceView, New Relic, ...)

I don't see the distinction you're making, as neither seems to record all
traces for accurate audit. Am I missing it?

~~~
dkuebric
I was speaking to a more general monitoring approach where you might want to
know p99 latency, request volume, error rate, etc, for each service to use in
alerting and trending. This is a common use-case for application monitoring
that isn't addressed well by a pure tracing approach.
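
As a rough sketch of what that kind of monitoring computes, assuming you have a record for every request (not just sampled ones) with a latency and a status code: the field names `latency_ms` and `status` and both function names are hypothetical, and the percentile uses the simple nearest-rank method.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests: List[Dict]) -> Dict:
    """Per-service numbers you would alert and trend on."""
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "count": len(requests),                 # request volume
        "p99_ms": percentile(latencies, 99),    # tail latency
        "error_rate": errors / len(requests),   # fraction of 5xx responses
    }
```

Computed over a sampled subset instead of every request, the count is wrong by construction and the p99 and error rate become estimates, which is the gap being described.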

If you're looking specifically for the 100% audit trail case, I'd look at
DynaTrace or potentially Instana.

------
ropiku
Looks like it doesn't support all languages. Not sure if users can contribute
plugins.

> You can use X-Ray with applications written in Java, Node.js, and .NET that
> are deployed on these services. Support for AWS Lambda is coming soon.

------
koolba
So is this AWS's New Relic competitor?

~~~
wsh91
Well, there _is_ Stackdriver Trace for free already on Google Cloud Platform.
(Disclosure: I work on GCP, albeit not on that team.)

~~~
tonyhb
And Stackdriver is Zipkin compatible, which is awesome. X-Ray doesn't seem to
support either Zipkin or OpenTracing, which is limiting.

------
loganbertram
Honest question: why is there so much AWS news up today? I mean, X-Ray is
particularly exciting for me, but there are currently 5/30 stories on the
front page which are basically just product announcements. I'm pretty new, but
is this normal? I thought the basic upvote criteria was meant to be that we
should focus on upvoting articles of some depth as well as interest.

~~~
iamed2
There's a big Amazon event (AWS re:Invent) and they are announcing an
extraordinary number of new tools. Think an Apple or Google event but one
where they announce a new product every 15 minutes. People with many different
kinds of jobs are excited for different reasons. It's double-Christmas for
people who work extensively with AWS.

------
spo81rty
Some have asked whether this is a full APM and whether it competes with New Relic,
Dynatrace, Stackify, AppDynamics, App Insights, etc.

Those products are primarily based on code profiling. For example, at Stackify
we automatically profile key methods for dozens of common dependencies and
frameworks to understand their usage and performance. That covers SQL, NoSQL,
caching, and queuing providers and many other things. Plus app errors, logs, etc.

So as best I can tell from the AWS blog and docs, the answer is no, it's not
a full APM. It appears to track how long a web request takes and any usage of
the AWS services via their SDK. More of a lightweight service mapping of AWS
services. So SQL database or HTTP calls probably aren't tracked.

In the future could they expand it? Sure. But for now it seems limited
compared to a full-blown APM product. Although this could be helpful for
identifying performance problems with AWS services.

Matt - Founder of Stackify

------
inthewoods
Significant blow to all APM players in the space. Approximately 50% of New
Relic's revenue comes from AWS. Datadog just announced their distributed
tracing system at $25/host. X-Ray is orders of magnitude cheaper than either -
and per-trace pricing is something that would be very tough for anyone else to
do and plays well to Lambda efforts.

X-Ray seems like a pretty basic service at this point but in my opinion the
writing is on the wall for other APM providers. Amazon chose this announcement
for a major slot so I expect they'll be investing in the service. As Bezos is
often quoted as saying "your margin is my opportunity."

~~~
puzzle
Where was Datadog's pricing announced?

~~~
dkuebric
Not sure about a public announcement but I heard the same from someone at their
booth at re:Invent.

------
xrd
Incredible. My book talks about writing tests to help build reliable systems
but having a dynamic runtime tool like this is an astounding addition to a
developer's toolset.

------
Xorlev
Hooray, hosted Zipkin. Would be really nice to have an OpenTracing backend that
sends to AWS X-Ray.

~~~
pritianka
An OpenTracing <> AWS X-Ray integration would rock. I work on the project and
would love to collaborate with anyone interested.

------
polskibus
Is this offering similar to Azure's Application Insights?

[https://azure.microsoft.com/pl-pl/services/application-insig...](https://azure.microsoft.com/pl-pl/services/application-insights/)

------
btashton
No Python support?

~~~
EugeneOZ
No Rust support, which is even more unexpected.

~~~
allengeorge
Why is that unexpected? As much as I love Rust - it's not exactly a top-10
language right now.

If there's no Python support - that shocks me.

~~~
EugeneOZ
It's language #1.

------
simlevesque
Just when I finished setting up NewRelic on Elastic BeanStalk :P

------
Thaxll
[https://aws.amazon.com/xray/details/](https://aws.amazon.com/xray/details/)

------
deepnotderp
Damn, AWS is on a tear RN.

------
res0nat0r
Is this comparable to Dynatrace?

~~~
the_aLgo
No, X-Ray is not an APM solution (yet). For example, it is missing End-User
Experience Monitoring (for more information see
[http://www.apmdigest.com/gartners-5-dimensions-of-apm](http://www.apmdigest.com/gartners-5-dimensions-of-apm)).

The similarity to Dynatrace is that - like the PurePath - X-Ray enables
distributed tracing. But the approach is different: whereas Dynatrace
instruments your application using an agent^, X-Ray needs you to instrument an
application using the provided SDK. Moreover, X-Ray seems to be limited to
applications communicating via HTTP(S).

^ in most cases

------
henrygrew
I was expecting Python support

------
honkhonkpants
Hrmm name collision with Google's previously published XRay function tracing
system
[http://research.google.com/pubs/pub45287.html](http://research.google.com/pubs/pub45287.html)

~~~
Zikes
Same for AWS Shield & Google's Project Shield.

------
dboreham
Ugh. Of course I read the comments first to see if the article is worth
reading and it takes me 10 minutes to realize this is not Amazon's 3D
rendering-as-a-service service.

