We built Jolt to solve a fundamental problem with AI coding tools - they struggle with larger, real-world codebases and cannot accurately determine the context files for your prompt.
Jolt's public beta is live today, and we'd love your feedback. You can use Jolt on the web or in any VSCode-based IDE. Our free tier will let you work on codebases up to ~100K lines.
We developed a novel way for Jolt to understand large codebases and automatically determine context files. Our approach scales to multi-million line codebases - the largest codebase using us in production is over 8M lines.
Jolt is optimized for codebase understanding rather than response speed. That said, using Jolt is significantly faster than other tools when you include the time spent selecting context files.
Manually selecting context files is a non-starter when you're working on larger codebases or unfamiliar code. It's a broken product experience. Most AI coding tools rely on some flavor of vector-embedding RAG to determine the files related to your prompt. The reality is that vector-embedding search is not effective on code, and there is a sharp drop in efficacy as codebase size increases.
Here's what folks are using Jolt for:
- Writing code, tests, and refactoring
- Onboarding developers
- Brainstorming a feature's implementation
- Asking questions about OSS repos
- Writing documentation, including mermaid charts
- Contributing across the stack
Thanks for reading, and let us know what you think.
Hi Dmitry, thanks for replying here. You raise some good points - I'll do my best to address them below. I'll also add that we have a fully-featured free tier that you can sign up for on our website (www.multiple.dev). Any hands-on feedback from our product or the TestGen feature would be extremely helpful.
Test feedback - during our TestGen flow, the user provides feedback on the sequence and contents of the API requests. And at the end of the flow, our users can manually edit the resulting JS code for additional customization.
Effort to create a load test - You can go from a Swagger or HAR file to a functional load test, written in JS, in a few minutes. There is no learning curve, assuming you have basic knowledge of JavaScript. Maintenance is typically minimal.
CLI - we are launching our CLI shortly, where users can start tests from command line as you describe. It'll work similarly to Jest or other unit test frameworks, where the test scripts will live in our user's codebase.
The use of AI - we use AI to generate realistic-looking synthetic data, which can be challenging with strings. The AI matches each field to the most relevant faker-js function. We need the content of the string to look like something the target application would receive in production. And with HAR files, we use AI to help filter out irrelevant requests such as analytics.
I hope that was helpful, and I'm happy to go into more detail.
> Test feedback - during our TestGen flow, the user provides feedback on the sequence and contents of the API requests.
So, it is not fully automated, the user needs to provide the feedback, or is it optional?
Originally by feedback, I meant if there is a feedback loop between the system and the test harness, so the test harness can learn from the system behavior and produce better data / spend less time on ineffective cases. This also is essential for things like test case reduction when a failure happens.
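To make the test case reduction point concrete, here is a minimal sketch of greedy input shrinking. It is a simplification for illustration, not how any tool in this thread implements it; `reduceFailingInput` and the `stillFails` callback (which would rerun the test on a candidate input) are hypothetical names.

```javascript
// Greedy test case reduction: repeatedly try deleting chunks of the input
// and keep any deletion after which the failure still reproduces.
// `stillFails` is a hypothetical callback that reruns the test on a candidate.
function reduceFailingInput(input, stillFails) {
  let current = input;
  // Start with large chunks and halve the chunk size each round.
  for (let size = Math.ceil(current.length / 2); size >= 1; size = Math.floor(size / 2)) {
    let i = 0;
    while (i < current.length) {
      const candidate = current.slice(0, i) + current.slice(i + size);
      if (stillFails(candidate)) {
        current = candidate; // deletion kept; retry at the same offset
      } else {
        i += size; // deletion lost the failure; move past this chunk
      }
    }
  }
  return current;
}

// Usage: shrink a payload that fails only because of an embedded NUL character.
const minimal = reduceFailingInput(
  "name=Alice&note=ok\u0000&city=Oslo",
  (s) => s.includes("\u0000")
);
```

The smaller the reproducing input, the easier it is to see which part of the request actually triggers the failure.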
> There is no learning curve, assuming you have basic knowledge of JavaScript. Maintenance is typically minimal.
I'd be cautious about saying that there is no learning curve. Based on the docs at https://docs.multiple.dev/how-it-works/ai-test-gen I see that one who uses the feature should also be aware of your environment API, e.g. `ctx`, `axios`, etc. That does not match my expectations when I read about no learning curve and basic JS knowledge. It is not far from there though.
> CLI - we are launching our CLI shortly, where users can start tests from command line as you describe. It'll work similarly to Jest or other unit test frameworks, where the test scripts will live in our user's codebase.
Cool! So, the user needs to commit the test code to their codebase, right?
> The use of AI - we use AI to generate realistic-looking synthetic data, which can be challenging with strings. The AI matches each field to the most relevant faker-js function. We need the content of the string to look like something the target application would receive in production. And with HAR files, we use AI to help filter out irrelevant requests such as analytics.
Yep, thanks for the clarification. I am wondering how effective such realistic-looking synthetic data is at uncovering defects. If it covers the happy path, what about uncommon scenarios? Specifically, can it still cover uncommon characters (from various Unicode categories)?
Overall, I'd say that I like the idea and what I've read in the docs :) Good luck!
Postman generates data based on datatypes in the OpenAPI spec: strings, numbers, booleans, etc - but the data will not look realistic. The video outlined a rudimentary test that checked if required fields were present.
Our TestGen feature generates realistic-looking data, such as dates, names, addresses, URLs, etc, automatically based on the field names, examples, and other API spec metadata. It does this automatically, without human intervention. The output is JavaScript, so if further customization is needed, such as using a response value of an API call in a subsequent request, you can do that.
From my understanding, Schemathesis can generate data based on a value being a string, number, boolean, etc. It also seems fairly manual to set up and has a learning curve. Our output is JavaScript that can be run anywhere.
With our TestGen feature, the AI looks at example requests in a HAR file or Swagger examples, or it can solely rely on the name of the property. From there, it automatically generates the correct type and format of data - e.g., if a field is named "address," it generates a value that looks like an address and is formatted in the same way as examples. It wouldn't be practical to cover every potential edge case and scenario without AI.
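As a rough illustration of the field-name-to-generator idea described above, here is a sketch under loud assumptions: a plain substring match stands in for the AI matching step, and fixed sample values stand in for faker-js calls so the snippet has no dependencies. All names (`generators`, `byType`, `pickGenerator`, `buildPayload`) are hypothetical and not taken from the product.

```javascript
// Stand-in generators. In a real script these would be faker-js calls;
// fixed sample values keep this sketch dependency-free and deterministic.
// Key order matters: "email" is checked before "address" so that a field
// such as "emailAddress" resolves to the email generator.
const generators = {
  email: () => "jane.doe@example.com",
  address: () => "221B Baker Street, London",
  name: () => "Jane Doe",
  date: () => "2024-01-15",
  url: () => "https://example.com/item/42",
};

// Hypothetical fallback by declared type when no keyword matches.
const byType = { string: () => "lorem", number: () => 7, boolean: () => true };

// Pick a generator from the field name first, then fall back to the type.
// The substring match is an assumption standing in for AI-based matching.
function pickGenerator(fieldName, type) {
  const lower = fieldName.toLowerCase();
  const key = Object.keys(generators).find((k) => lower.includes(k));
  return key ? generators[key] : byType[type] || (() => null);
}

// Build one request body from a flat { fieldName: type } schema.
function buildPayload(schema) {
  const payload = {};
  for (const [field, type] of Object.entries(schema)) {
    payload[field] = pickGenerator(field, type)();
  }
  return payload;
}

const payload = buildPayload({
  shippingAddress: "string",
  contactEmail: "string",
  retries: "number",
});
```

A keyword table like this breaks down quickly on ambiguous names, which is presumably where a model-based matcher earns its keep.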
Also, an important note would be that Schemathesis is a property-based testing tool, which does not necessarily imply load testing. I.e. the comparison might not be that helpful, as the tools have different goals.
However, Schemathesis can use targeted property-based testing to guide the input generation to values more likely to cause slow responses, i.e. it can maximize the response time and at the end, the user can discover that passing `limit=100000000` will read the whole DB table and cause a response timeout (which is a trivial example though)
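The idea of steering generation toward slow inputs can be sketched as a simple hill climb over the `limit` parameter. This is only an illustration of the general technique, not how Schemathesis does it; `simulatedLatencyMs` is a made-up cost model standing in for a real measured request, and `findSlowInput` is a hypothetical name.

```javascript
// Simulated endpoint latency in milliseconds: a hypothetical model in which
// the handler reads `limit` rows at a fixed cost per row.
function simulatedLatencyMs(limit) {
  return 5 + limit * 0.0001;
}

// Hill climbing: from the current value, try a few mutations and move to the
// one with the highest observed latency, until the timeout is crossed or no
// mutation improves on the current best.
function findSlowInput(measure, timeoutMs, maxSteps = 100) {
  let best = 1;
  let bestLatency = measure(best);
  for (let step = 0; step < maxSteps && bestLatency < timeoutMs; step++) {
    const candidates = [best + 1, best * 2, Math.max(1, Math.floor(best / 2))];
    let improved = false;
    for (const candidate of candidates) {
      const latency = measure(candidate);
      if (latency > bestLatency) {
        best = candidate;
        bestLatency = latency;
        improved = true;
      }
    }
    if (!improved) break; // local maximum reached
  }
  return { limit: best, latencyMs: bestLatency };
}

// Usage: search for a `limit` value that pushes the simulated latency past 1s.
const slow = findSlowInput(simulatedLatencyMs, 1000);
```

Against a real service, `measure` would send a request and time it, and the observations would be noisy, so a practical version would need repeated measurements per candidate.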
Schemathesis author here. I hope to clarify a few points here
> From my understanding, Schemathesis can generate data based on a value being a string, number, boolean, etc
Schemathesis can generate data that matches the spec or not based on the config option (specifically meaning JSON Schema based validation) including all the formats (e.g. date, etc) defined by the Open API spec. For GraphQL it supports all built-in scalar types + a handful of popular ones like DateTime or IP. With extra configuration, it can also generate syntactically invalid data (e.g. invalid JSON). Serialization is a different step - the payloads can be serialized to JSON or XML, YAML, etc. In my private extension, I also use a Python version of `faker` to mix more realistic data into the set.
> It also seems fairly manual to set up and has a learning curve. Our output is JavaScript that can be run anywhere.
The simplest one-off run is `st run <SCHEMA>`, and it is not clear to me what you mean by being fairly manual to set up. If the user already has a schema (or derived it from traffic / generated by a framework, etc), the only thing they need is to invoke the CLI. Surely there are many config options for different scenarios, and one may take more effort to configure than the other.
Everything has a learning curve - the more interesting questions are whether this learning curve is justifiable and how often the user needs to dive deep into configuration. My aim with Schemathesis is that in 90% of cases its defaults should be enough for most users, and for the remaining 10% there should be as few barriers as possible for the user to accomplish their goal (which is often generating data that has a higher probability of uncovering defects).
> From there, it automatically generates the correct type and format of data - e.g., if a field is named "address," it generates a value that looks like an address and is formatted in the same way as examples. It wouldn't be practical to cover every potential edge case and scenario without AI.
From the point of view of coverage of the edge cases, the description sounds like a happy-path scenario. What about the deviations?
Also, most fuzzers do a pretty good job in terms of covering edge cases without AI, especially greybox ones. What would be the concrete AI contribution here? Or what is the core difference in covering with AI or without it?
That’s a great question. The TestGen feature generates JavaScript that uses faker-js functions to generate the test data, and axios to make HTTP requests. You can copy and paste the JS output of TestGen and run it anywhere.
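For a sense of what a script in that style might look like, here is a hedged sketch, not actual TestGen output. The `/users` endpoints, the `createThenFetchUser` name, and the response shape are assumptions. The HTTP client and data generator are passed in as parameters so the snippet runs without installing anything; in a real script `http` would be `axios` and `fake` would be `faker` from `@faker-js/faker`.

```javascript
// `http` is expected to look like axios (post/get returning a promise that
// resolves to { status, data }); `fake` is expected to look like faker-js.
// Both are injected so the sketch can run against stubs.
async function createThenFetchUser(http, fake, baseUrl) {
  // Generate a realistic-looking request body.
  const body = {
    name: fake.person.fullName(),
    email: fake.internet.email(),
  };
  // Hypothetical endpoint: create a user and read back the assigned id.
  const created = await http.post(`${baseUrl}/users`, body);

  // Reuse a value from the first response in the next request.
  const fetched = await http.get(`${baseUrl}/users/${created.data.id}`);
  return { createdStatus: created.status, fetchedStatus: fetched.status };
}
```

Injecting the client is a choice made for this example only; it also happens to make the sequence easy to dry-run against a stub before pointing it at a real service.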
Hey, this is the author here. Thanks so much for sharing the post. We learned a ton about the nuances and limitations of prompting while building this feature. What are everyone's favorite AI features or integrations that they've seen?
Can you add a checkbox for lossy vs lossless compression? If you are doing lossy compression on PNGs, might as well change it to a non-png file. FYI, there are great lossless PNG compression algorithms out there - PNGcrush, oxipng, etc. ImageOptim has a bunch built-in.
Good point on the two Reacts. Call me old-fashioned, but I still prefer being able to serve a React app over a CDN instead of needing a Node.js server. Might be in the minority these days though.
The CLI is slated for end of Q4/early Q1. It will be able to run load tests, get results, manage your team, and most of the other functionality you get in our web app.
One major update included with the CLI is a code repo integration - you can have your load test scripts in your codebase and track results to commits.