K6 is my go-to for performance testing. What drew me in is that you can convert Postman collections to k6 scripts (https://github.com/loadimpact/postman-to-k6) if you already have Postman collections. If you don't have any Postman collections, you can just start with k6 and you can have a simple script up and running in under a minute. I've used k6 at work and for personal projects testing Scala, Go, and Python (I guess the language you're testing doesn't matter b/c you're just hitting http endpoints). If you're doing testing that requires getting an OAuth token, there's a flow for that. I've had nothing but good experiences and would heartily recommend k6 if you're looking to performance test!
Over the past 10 years I have glued together project-specific variants of "parse some proxy or WAF log and spit out test or automation data" at least 4 or 5 times. Including as recently as a couple of weeks ago:
Knowing HAR files existed would have saved me quite a bit of work and left me time to improve my automation or testing. I'm very happy to have discovered this.
You can use them to archive websites and recreate them offline; the Internet Archive supports HAR files for archiving.
You can use them to recreate user scenarios for acceptance testing and performance tests, via tools like K6 (which also supports InfluxDB, so you can gather the metrics yourself and have your own UI [via Grafana or anything else that supports InfluxDB]).
You can use them for troubleshooting all sorts of user/client issues if you have an easy way of gathering the HAR files, or if your users are developers.
They're simply an invaluable tool in web development these days, if you deal with web frontends/backends.
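For anyone who has been hand-parsing proxy logs: a HAR file is just JSON, so pulling the recorded requests out takes a few lines. A minimal Python sketch, assuming the standard HAR 1.2 layout (`log.entries[]`, each entry carrying `request`, `response` and a total `time` in milliseconds); the function name and the inline sample data are made up for illustration:

```python
import json

# Tiny hand-written HAR document (hypothetical data) so the example is self-contained.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {"time": 120,
             "request": {"method": "GET", "url": "https://example.com/"},
             "response": {"status": 200}},
            {"time": 340,
             "request": {"method": "POST", "url": "https://example.com/login"},
             "response": {"status": 302}},
        ],
    }
})


def requests_from_har(har_text):
    """Return one (method, url, status, time_ms) tuple per recorded request."""
    har = json.loads(har_text)
    return [
        (entry["request"]["method"],
         entry["request"]["url"],
         entry["response"]["status"],
         entry["time"])
        for entry in har["log"]["entries"]
    ]


if __name__ == "__main__":
    for method, url, status, time_ms in requests_from_har(SAMPLE_HAR):
        print(f"{method} {url} -> {status} in {time_ms} ms")
```

From a list like this it is a short step to generating test or automation data, which is roughly what HAR-to-script converters do for you.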
> JavaScript is not generally well suited for high performance. To achieve maximum performance, the tool itself is written in Go, embedding a JavaScript runtime allowing for easy test scripting.
How is it possible that a pure Go JavaScript interpreter (goja) with bindings for net/http and some reports would be faster than the same tool written in Node.js using its HTTP client (which, if I remember correctly, is written in C)?
I don’t mean to downplay the importance or usefulness of k6; I just find their reasoning behind choosing Go somewhat contrived.
I am not completely sure why the Go stdlib's HTTP client (which k6 uses) is faster than the NodeJS one. I think part of it is the fact that k6 spins up a separate JS runtime for each VU. goja is a much, _much_, slower JS interpreter than V8, but load tests are IO-bound, so that's usually not an issue. And you can spin up thousands of VUs in k6 (especially with --compatibility-mode=base), making full use of the load generator machine.
What do you mean "very long article"? You could have used "extensive" or something ;)
Anyway, what I've seen when comparing the performance of tools is that Artillery, which runs on NodeJS, is perhaps the worst performer of all the tools I've tested. I don't know if it's because of NodeJS or because Artillery itself isn't a very performant piece of software (it also consumes a lot of memory, btw).
If you want the highest performance, there is one tool that runs circles around all others (including k6), and that is wrk - https://github.com/wg/wrk - very cool piece of software although it is lacking in terms of functionality so mostly suitable for simpler load testing like hammering single URLs.
(I don't know how fast wrk2 is, haven't benchmarked it)
Oh yeah, that's my own load testing tool, written in the high-performance language Bash. It is a bit feature sparse but produces about the same RPS numbers as Drill (https://github.com/fcsonline/drill) and not too far behind Artillery. And it is about 0.0025 times as fast as Wrk!
Many benchmarks say JS-VMs are blazingly fast, at the same time others say they are battery-hogs. And Apple is even optimizing their CPU for it to run faster.
IMHO that is about JS memory usage and GC, which never really gets proper attention.
You can have really fast algorithms, but if you keep creating and forgetting millions of objects, which most JS frameworks do, you will have lags and GC pauses.
As a C++ developer who writes a lot of unit tests and cares a lot about performance, the tag line sounds appealing, but I can't tell whether this is something meant for me.
Is this a general tool, or is it specific to web development? e.g. Would this be an appropriate tool for Blender to detect performance regressions in their rendering?
We (Insightful) are developing a tool for you! :) It's called Insightful PerfLab - it will work for native code (CPU and/or GPU) and is designed to help things like Blender detect performance regressions in rendering (we built it to help detect performance regressions in our game engine, and then realized it would likely be generally useful to other developers).
Yes, that's it. If the C++ application exposes some service, then k6 might be the tool to test how it behaves under load now (or in the future). The only protocols we currently support are HTTP(S), WebSockets and gRPC, though we plan to add a lot of others (raw TCP/UDP, various DBs, queues, messaging, DNS, etc.) in the future, and we've recently added a way to extend the k6 functionality by writing external Go modules:
- https://github.com/loadimpact/k6/blob/master/release%20notes...
- https://github.com/k6io/xk6
I'm glad this was posted. I tried it this evening and it worked great. It took me like 20 seconds to get started, and I was hooked. (A few more minutes were required to set up a tsconfig.json so that my editor could type-check the k6 imports. I don't think the documentation really says "add a tsconfig with compilerOptions.types = [k6]" anywhere, but that worked for me. Writing the tests in TypeScript instead of JavaScript is an exercise for another day.)
I like the design of running each virtual user in its own Javascript VM inside of a Go process. I can just install a single binary, but still write tests in a high level language (and not rebuild the load testing framework for every change).
An idea that's been kicking around in my head for a while is extending Go applications by embedding a WebAssembly VM, exposing relevant hooks, and letting users add whatever they want at runtime. That way, you don't have to bundle a bunch of crap into a statically linked binary, users can go add that later :) K6 seems to validate that this is a good idea; while not WebAssembly, it sure works well. (I would want to write my plugins in Go, not Javascript... but the idea is good.)
I also discovered K6 a few days ago, it looked quite nice and could be a good replacement for our Gatling tests.
We were initially looking for something slightly different though: we were interested to have perhaps fewer tests, but tests that would run much much more often (like every second or couple of seconds), in a continuous manner. The goal was to have something at the same time like a healthcheck (is it still working), like a performance test (does it answer in a timely manner) and like a validation test (does it answer the right result - the endpoints we wanted to test do "complex calculations").
Our best answer so far was to wrap K6 in an infinite loop, but I wonder if there could be something smarter.
I might be missing something, but k6 should be able to completely cover all of your use cases? I am one of the k6 developers, can you share exactly what the missing piece was?
> tests that would run much much more often (like every second or couple of seconds), in a continuous manner.
You can do that, just use an arrival-rate executor that runs an iteration every second, with a test duration of 365 days or something like that :) See https://k6.io/docs/using-k6/scenarios/arrival-rate
> The goal was to have something at the same time like a healthcheck (is it still working), like a performance test (does it answer in a timely manner) and like a validation test (does it answer the right result - the endpoints we wanted to test do "complex calculations"). Our best answer so far was to wrap K6 in an infinite loop, but I wonder if there could be something smarter.
You can certainly wrap k6 in an infinite loop. Nothing wrong with that, though you can probably use the `scenarios` feature (with long `duration` values) to achieve it without wrapping k6: https://k6.io/docs/using-k6/scenarios
Gatling is super slow. Suspiciously slow, in fact. You will likely not be able to really hit your endpoints hard enough by using one machine only.
Personally I use autocannon, which in my testing is faster than k6 and easier to use in a Node.js environment. It is only an npm package, not a full piece of software you need to install on your machine. Autocannon in my testing is 2 orders of magnitude faster than Gatling, but you should do your own testing. PS. I have no affiliation with autocannon, I'm just a very happy user.
The only drawback compared to k6 is lack of websockets, which is unfortunate.
2 orders of magnitude faster than Gatling? Would be interesting to see how you ran your tests, and the results.
In my testing, Gatling isn't catastrophically slow by any means. Wrk, which is faster than any other tool I've seen, is about one order of magnitude faster than Gatling, in terms of raw RPS generation. I find it very hard to believe a NodeJS-based tool executing JS would be faster than the fastest tool written in C that just hammers static URLs.
What I like too is that there is a Datadog connector. It sends the number of VUs and we can correlate that in our main Datadog board, viewing how the app reacts to the load.
I've been using this for nearly a year and it's by far the best thing I've found so far for testing realistic websockets load where servers are put under strain not just by connections but interactions between each other. Many thanks to the dev team behind this. It's a great piece of work.
This thread is the finest example of why I like this community so much. Been using this tool for quite some time and I thought I had figured out most of the stuff. Then I noticed this thread and saw there are even more useful options and functionalities. Thank you HN.
Was a big fan of using LoadImpact for years and started testing with K6 as LoadImpact deprecates. (I use them to benchmark WordPress Hosting company performance: https://reviewsignal.com/blog/2020/06/01/wordpress-hosting-p...) I have to say overall the experience was pretty good. I was able to write near identical scripts easily. The documentation was good. Overall pretty happy with K6 as a successor and stand alone. I've used a lot of load testing tools and it's definitely one of the best out there.
If you statically or dynamically link any part of the k6 codebase with some other code (a derivative work), the license’s copyleft virality is triggered and that other code also needs to be made available under AGPLv3. When it comes to using the k6 binary and interacting with it from another process or over a network it is not certain exactly how the copyleft virality would apply from what I’ve been told by lawyers with OSS license knowledge.
Thanks. I must say, I prefer weak copylefts for my projects (MPLv2 with the incompatibility clause), so my concern isn't copyleft per se, but the virality of it.
That said, I do have a couple questions:
- Would a non-AGPLv3 open-source or source-available project using K6 be incompatible with AGPLv3 given they might "distribute" CI/CD along with their open-source or source-available code?
- If a closed-source project publishes results from K6, are they now in violation of AGPLv3?
I'm not a lawyer, so take that into account when reading my reply :)
> Thanks. I must say, I prefer weak copylefts for my projects (MPLv2 with the incompatibility clause), so my concern isn't copyleft per se, but the virality of it.
I need to read up some more on MPLv2, but from what I've read now it looks like a license that could fit k6 as well.
> - Would a non-AGPLv3 open-source or source-available project using K6 be incompatible with AGPLv3 given they might "distribute" CI/CD along with their open-source or source-available code?
No, using k6 as a tool in CI/CD would not be a violation of AGPLv3 or trigger the virality of the license in regards to the non-AGPLv3 open source or source-available code base. Any k6 test scripts you write would also not need to be licensed under AGPLv3. Gitlab built an integration with k6 which could be seen as a data point in agreement with this view: https://docs.gitlab.com/ee/user/project/merge_requests/load_...
> - If a closed-source project publishes results from K6, are they now in violation of AGPLv3?
If we're talking about publishing results from testing of the closed-source project, then no, that would not be in violation. There's however an undefined gray area around clause 13 "Remote Network Interaction" in AGPLv3 in terms of what is allowed when for example offering a SaaS product based on an AGPLv3 licensed component such as k6. Clause 13 has not been tried in court afaik. It could seem that clause 13 would provide some protection for a business like ours from other companies building commercial solutions on top of k6, but from what I've been told by lawyers we shouldn't count on that, so we're not (anymore).
We have had many internal discussions on this topic, as well as with other companies and individuals in the open source space: whether to go the route of a source-available license, effectively restricting commercialization possibilities of k6 by other companies, or to go the open core route and restrict what we release as open source. Every time we've had this discussion internally we've come back to open source being the right choice for us, given the type of product we build and where we think we can capture value.
Thanks a lot for being so thorough with the answers. (IANAL either but) Given the above, I think you'd be fine with AGPLv3.
MPLv2 and other related licenses such as Eclipse Public License v2, Erlang Public License, do have the advantage of being well understood and in some cases auto-approved for use at various enterprises and thus a good midway between MIT / Apache and GPLv3.
That said, you'd be right to lean more-copyleft (going Server Side Public License, for example) if K6 is a key product (and not a complementary product), though Bryan Cantrill thinks you might be better off closing up the source in that case: http://dtrace.org/blogs/bmc/2018/12/14/open-source-confronts...
Member of team artillery.io here, my 2c on the subject. We chose MPLv2 specifically to address licensing concerns. With MPLv2, you can build on top of Artillery, you can build plugins and extensions for it, and integrate it into your systems without worrying about licensing. It's a well-understood license with very clear boundaries between your code and MPLv2-licensed dependencies (e.g. all Hashicorp tools use MPLv2).
That's unfortunately not the case for AGPL. There's a reason a lot of companies have policies banning any AGPL dependencies outright, Google probably being the most prominent example: https://opensource.google/docs/using/agpl-policy
AGPL is designed to be extremely viral and has not been tested in court. The definitions of boundaries between your code and AGPL-licensed code are not well understood. Consider that MongoDB, probably the most popular AGPL-licensed project in use before 2018 when they switched to SSPL, had to explicitly publish their drivers under Apache because communicating with an AGPL dependency over a network was not sufficiently distant to prevent infection. https://www.mongodb.com/blog/post/the-agpl
As engineers we need to be aware of licensing implications of code we depend on. I am obviously not a lawyer, and obviously your employer's legal team are the people you should be talking to if you have an absolutely critical dependency on an AGPL-licensed project.
It’s become the go-to load testing tool at my company. Personally I’ve only used it lightly, and enjoyed the experience. And others at my company who do heavy load testing rave about it.
I would gladly recommend this to anyone needing to load test their endpoints. We were using Gatling at work to do load testing and it was very slow to develop in, as we lacked experience in Scala. We found this and never looked back. The impact it had on benchmarking, profiling and tuning the servers was massive. I'm pretty sure the company saves a few thousand dollars every month to this day because of those optimizations.
I've been using it for 6 months to test a Kubernetes microservice application. I love the model where you write CODE for your tests. We code the tests in TypeScript.
We are using Soasta at our company. We use it to record, parametrize, and play back to simulate multi-user workflows. Can't see any such support here. One of the problems with Soasta is that it requires a lot of servers to trigger load. We're looking to move away from it. k6 seems interesting, but I couldn't find any record/playback functionality. Is there any addon/extension which can be used to achieve a similar purpose with K6?
However, you don't need a cloud subscription to run the tests, you can copy-paste the generated script code and `k6 run` it locally, without paying us anything.
I am one of the k6 developers. I've commented in a bunch of threads here, but if you have any questions, feel free to AMA :) I'll be around for the next ~1h and then probably later in the day as well.
Is there some way you can do whatever it is AWS CDK does under the hood to provide language bindings to other languages for writing tests? That'd be killer. I want to be able to write the tests in my backend language.
Unfortunately not... At some point in the far future, we might release support for Go "scripting" [1] or we might extend xk6 [2] so it allows different script types, but those things are far from certain.
I've used K6 as well. I like it bcoz it's code, but my (major) complaints are lack of auto-complete (intellisense) in IDE and debugging. Without those two, it's hard to do exploratory testing.
Regarding debugging, that's unfortunately unlikely to come any time soon... We'd need support for that in the JS runtime we're using (https://godoc.org/github.com/dop251/goja) and then we'd need to figure out how to expose it in k6, while dealing with potentially hundreds of concurrent JS runtimes (VUs). Not impossible, just unlikely to land anytime soon...
Artillery team member here. "Rather pricey" really depends on what you compare it to. Artillery Core is free. Artillery Pro costs money, but it runs directly in your cloud environment, so it's extremely cost effective compared to hosted SaaS solutions. We designed it for large-volume use, especially in CI/CD pipelines. There are no limits on test minutes, vuser concurrency, number of tests you can run etc.
Can you DIY a solution that will let you instantly scale up from running tests locally to running them on 500 workers in any of 13 different geographical regions? With no servers to manage or maintain whatsoever. Sure you can, but it's not the best use of time for a lot of teams.
It seems like the hard part is deciding when a change is suspicious enough to be investigated? Simple pass/fail tends not to work at scale because complicated systems tend to be noisy.
See the example in https://github.com/loadimpact/k6/#checks-and-thresholds. If, at the end of the test run, some of these rules (encoded as `thresholds`) were unsatisfied, k6 will exit with a non-zero exit code (and thus fail your CI check):
- The 95th percentile of all HTTP request durations should be less than 500ms
- The 99th percentile of all HTTP request durations that were tagged with `staticAsset:yes` should be less than 250ms
- The failure rate of the checks should be less than 1%, although if it's more than 5% the test will abort immediately.
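To make the pass/fail logic concrete, here is a rough Python sketch of what those three rules amount to at the end of a run. This is not k6's API or implementation (in k6 the rules are declared as `thresholds` in the script); the function names, the nearest-rank percentile, the sample inputs, and the exit code value `1` are all assumptions, and the abort-at-5% behaviour during the run is not modelled.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list, for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def evaluate(durations_ms, static_durations_ms, checks_passed, checks_failed):
    """Evaluate the three rules; return (per-rule results, process exit code)."""
    failure_rate = checks_failed / (checks_passed + checks_failed)
    results = {
        "p95 of all request durations < 500ms":
            percentile(durations_ms, 95) < 500,
        "p99 of staticAsset:yes durations < 250ms":
            percentile(static_durations_ms, 99) < 250,
        "check failure rate < 1%":
            failure_rate < 0.01,
    }
    # Any unsatisfied rule turns into a non-zero exit code, which is what
    # makes a CI job fail. The value 1 is arbitrary for this sketch.
    exit_code = 0 if all(results.values()) else 1
    return results, exit_code


if __name__ == "__main__":
    # Hypothetical samples: 10% of requests are slow, so the p95 rule fails.
    results, code = evaluate([100] * 90 + [900] * 10, [50] * 100, 995, 5)
    for rule, ok in results.items():
        print("PASS" if ok else "FAIL", rule)
    raise SystemExit(code)
```

The point for the noise question is that each rule is a statement about a distribution (a percentile or a rate) rather than about any single request, so a few slow outliers do not fail the build on their own.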
Does K6 deal with the coordinated omission problem?
Gil Tene (Azul Systems) has argued convincingly [1] (slides [2]) that monitoring tools get latency measurement wrong because sudden spikes aren't represented correctly in timings because of averaging and the wrong use of percentiles.
He argues that percentiles simply aren't useful, because, statistically, most users will experience >= 99.99-percentile response times at some point, since a single session involves many requests. All percentiles lie; the only truly useful and realistic number is, in fact, the 100th percentile.
He also argues that the most revealing way to show latency is with a complete histogram.
I haven't used K6 yet, but I've been looking for a good load-testing tool, and intend to try it out.
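One way to read the percentile argument above is as simple arithmetic: if a session issues n requests, the chance that at least one of them lands at or above the p-th percentile is 1 - (p/100)^n. A small Python sketch, assuming request latencies are independent (a simplification; the function name is made up):

```python
def chance_of_seeing_tail(percentile, requests):
    """Probability that at least one of `requests` independent requests
    is at or above the given latency percentile."""
    return 1 - (percentile / 100) ** requests


if __name__ == "__main__":
    for pct, n in [(99, 1), (99, 100), (99.99, 10_000)]:
        print(f"p{pct}, {n} requests: {chance_of_seeing_tail(pct, n):.1%}")
```

Under that assumption, a session of 100 requests has roughly a 63% chance of hitting the 99th percentile at least once, which is why a "rare" tail latency is something most users end up seeing.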
This used to be an issue with older k6 versions, but since v0.27.0 we have arrival-rate executors [1] that should address the biggest issue of coordinated omission. And we've always measured the max value of all metrics, even when they are not exported to some external output and we just show the end-of-test summary.
It has been a while since I watched these Gil Tene talks, so I might be missing something, but I think the only remaining task we have is to adopt something like his HDR histogram library [2]. And that's mostly for performance reasons, though we'll probably play around with the correction logic as well.
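As a rough illustration of why an arrival-rate (open) model matters for coordinated omission, the toy simulation below (plain Python, not k6) compares a single looping VU that waits for each response with a generator that starts one request per millisecond regardless of responses. The server model (1 ms responses, frozen for one second in the middle of a 3-second test) and all names are assumptions chosen for the example.

```python
STALL_START_MS, STALL_END_MS, SERVICE_MS = 1000, 2000, 1


def response_time(start_ms):
    """Latency of a request sent at start_ms to a server that normally answers
    in SERVICE_MS but is frozen during [STALL_START_MS, STALL_END_MS)."""
    if STALL_START_MS <= start_ms < STALL_END_MS:
        return (STALL_END_MS - start_ms) + SERVICE_MS
    return SERVICE_MS


def closed_model(duration_ms):
    """One looping VU: the next request starts only when the previous one returns."""
    latencies, now = [], 0
    while now < duration_ms:
        latency = response_time(now)
        latencies.append(latency)
        now += latency
    return latencies


def open_model(duration_ms, interval_ms=1):
    """Constant arrival rate: a request starts every interval_ms no matter what."""
    return [response_time(t) for t in range(0, duration_ms, interval_ms)]


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    return ordered[(p * len(ordered) + 99) // 100 - 1]


if __name__ == "__main__":
    for name, samples in [("closed", closed_model(3000)), ("open", open_model(3000))]:
        print(f"{name}: n={len(samples)} p99={percentile(samples, 99)}ms "
              f"max={max(samples)}ms")
```

In this toy setup the looping VU records the stall as a single slow sample, so its p99 stays at 1 ms, while the constant-rate generator records a p99 of 971 ms. Both see the same max (1001 ms), which is also why always reporting the max is a useful safety net.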
k6 allows the user to choose[1] which metrics are relevant for the particular test. By default, it displays max or p(100), p(95), p(90), min, med, and avg. Users can specify other values such as p(99.995).
It's also possible to create completely custom metrics[2] to track whatever is relevant to the user.
k6 allows the user to change almost all aspects of execution, tracking, and reporting.
Yup. I put a couple of metrics for failures of a couple of types in the load script and ship them out to InfluxDb/Grafana for plotting in the very first graph. That way, if I see a lot of "failure" color, flags go up. Otherwise the usual amount of background noise is ok.
I used k6 a few years ago to do some trivial benchmarks.. Nothing in depth.. But it was the best experience so far.. Easy to get started, good documentation and results well laid out.
> k6 is a modern load testing tool, building on Load Impact's years of experience in the load and performance testing industry. It provides a clean, approachable scripting API, local and cloud execution, and flexible configuration.
This sounds more like integration testing than unit testing. Something like Google's micro-benchmarking library [1] is more like unit testing.
Beyond technology stack choices I'd say the biggest differences are that k6 is scriptable in JS whereas Tsung has scripting-like capabilities in XML, and that Tsung supports distributed execution of tests, which k6 doesn't support yet (only in our SaaS product). Tsung also supports more protocols than k6 at the moment (e.g. AMQP, MQTT, PostgreSQL etc.) but k6 supports HTTP/2 and gRPC, which Tsung doesn't, so it depends on one's needs I suppose.
Tsung was made in another era, before devops. I think it shows in the UX and everywhere when using the tool that it wasn't really meant for developers.
Specifically, and with the disclaimer that I haven't used Tsung so much, I'd say the biggest difference is that k6 tests are scripted in a real language (JS) as compared to Tsung's XML.
But k6 also has a modern CLI UX that beats Tsung (and most other tools) by a mile, more integrations now and possibly support for more protocols, or at least more relevant ones (not sure if Tsung supports HTTP/2 for instance). Plus k6 is being very actively developed.
From what I've seen of Tsung I really like it though. Seems like a very performant and solid piece of software. But like mentioned earlier, it was made in another era. Just like Jmeter. If I ranked tools in terms of ascending "DevOps:ness" and UX for developers it'd probably be something like:
This very much depends on your available hardware and the complexity of the load test. Following the advice in https://k6.io/docs/testing-guides/running-large-tests, and having a beefy load generation machine and network, you should be able to run many thousands of VUs on a single instance pretty comfortably. How many thousands precisely depends on a lot of factors... :)
The most I started on a single machine was about 50.000 VUs. Back then, the limit was in available sockets. k6 now has support for multiple source IPs, so that limit has been effectively removed, and more VUs should be possible.
On the `k6 cloud` side, we have executed 500k+ VUs. The most RPS I achieved with k6 was 4 791 928 (~4.8 million requests per second). That test lasted for 6 min and generated 1.5 billion requests in total.
Interesting. How did you collect metrics for reports when running that many VUs? In a recent attempt, InfluxDB quickly became the bottleneck for us when testing with around 5000 VUs. Are there some best practices outlined somewhere, or which you could outline here, on how to collect metrics for large runs?
For that local test, I relied on the k6 CLI output.
InfluxDB would most likely not be able to handle this amount of data. We are looking to add a TimescaleDB output to enable metric collection beyond what InfluxDB is capable of. There's a working example here if you want to give it a shot ahead of the official release: https://github.com/loadimpact/open-source-load-testing-stack
Local execution with cloud output is also possible (k6 run -o cloud), but that's not free.
To give you some idea of what k6 cloud can do, here are screenshots of a few large scale tests https://imgur.com/a/NtXsc1a
The soon to be released "xk6 - k6 extensions"[1] will bring support for many new protocols.
There are already proof-of-concept extensions for Redis, ZeroMQ, Kafka, SQL, and others.