Show HN: Donobu – Mac App for Web Automation and Testing (donobu.com)
134 points by wewtyflakes 55 days ago | 35 comments
Been working on a desktop app for Mac that lets you create web flows and rerun them (https://www.donobu.com/).

You can optionally use AI (BYOK: bring your own keys) to create flows for you and to do other interesting things, like making vision-based semantic assertions. Also, your data lives on your own filesystem, and we do not see any of it (further still, there is no phoning home at all). A nice benefit of this being a desktop app rather than a SaaS product is that if you happen to be developing/iterating on a webpage locally, this has no problem hooking into it.

What this is intended to be a good fit for:
- Testing web pages, especially locally.
- Exploring random webpages with a stated objective.
- Automating tedious flows. Rerunning a flow won't get caught up on using a single selector (many websites randomize element IDs, for instance); there is smart failover using a prioritized list of selectors.
- Getting a quick draft of an end-to-end test in JavaScript.

What this is a bad fit for:
- Mass web scraping (too slow).
- Adversarial websites.

What we are still working out:
- Click-and-drag operations.
- Websites that are primarily controlled from canvas.
- Smoothing out UI/UX (we are two backend engineers trying our best, and are handily outgunned by real frontend engineers).

Fun things to try:
- Asking it to assert that a webpage has a certain theme.
- Asking it to run an accessibility report for a page (uses https://github.com/dequelabs/axe-core).
- Asking it to run a cookie report for a page.

The tech:
- Java 21 for the main business logic.
- Javalin 6 for the web framework (https://javalin.io/).
- Playwright for controlling the browser (https://playwright.dev/java/).
- Axe for running accessibility reports (https://github.com/dequelabs/axe-core).

Critical feedback is welcome. Thanks for trying it out!

Cheers, -Justin and Vaz




Such a nice, simple landing page. I could immediately understand and appreciate what this app does by first looking at the picture where the flow is defined, then seeing the flow executed on video.

It's like the old Apple commercial: "There is no step three."

Congratulations on the launch!


I thought the same, so many Show HN landing pages leave me wondering what it is the product does. Not this one, clear and concise.


Well, I couldn't understand what this Show HN is about after spending two minutes on their website and closing it. By definition, it's not a good landing page if I need more than one minute to grasp what exactly you're offering.


Hello, I will be using your app soon. I was looking for something playwright related for the website I am making. I am so glad that the price is not "contact us".

I noticed in your video that Donobu does not move the mouse to the search box before typing in it. I hope this does not trigger captchas or anti-bot protection--I was thinking of adding "Firebase App Check" to my website since Firebase recommends it to everyone and it uses "reCAPTCHA Enterprise". Not sure if this will turn my website into an "adversarial website".

I think Donobu would also be a lot more helpful on mobile since there are more phones than desktops in general. I was looking for some kind of automated mobile testing and found none. The quickest way I can think of to add that is using the new iOS 18 with desktop control of the phone.

I think you could "easily" translate this to arbitrary desktop control test software, or make some other agent software that does. If you don't, someone else will: https://youtube.com/clip/UgkxcFMelp1K31l7pH0Sbghb4sJ-eF0O8-x... If you made the desktop control software, I think you would get mobile control software for free.


We make some attempts to be human, though dedicated bot checks may well find Donobu. One thing that works in Donobu's favor is that it is intended to be run locally, so requests coming from it do not come from some giant Amazon or Google data center IP address range. Also, Donobu is not optimized for mass web scraping (it has deliberate delays included), so it does not tend to get rate limited. That all being said, there is some low-hanging fruit we will be working on in order to be generally less bot-ish.


Playwright (and axe) is a good option, but how do I have confidence it's performing the test correctly and repeatably? If the test seems flaky how do I know it's the software and not randomness of the AI part? I want tests to be predictable.


As for running the axe test itself: once the agent has decided to run it, a dedicated tool runs an off-the-shelf axe script. The axe script itself does not change from run to run, so assuming you are running the test on the same page, you should get the same result.


The axe element will be the same each time, but you can do that without any AI shenanigans - just run it in a normal Playwright test.

My question was really about the page interactions and the assertions being driven by AI: if they are going to be generating different code every time the test runs, how can you have any confidence in the test not having false positives and false negatives at least some of the time, unless you read the generated script each time?

That sounds like a lot more work than just writing the test once in the traditional way (codegen or manually) and tweaking it only when there's a breaking change to the page.

If people are genuinely using this approach then there must be something I'm missing.


We are going to be adding the ability to trigger more actions (beyond the normal clicks/keyboard) without AI by using the in-browser control panel. We wanted to add it for this ShowHN, but we ran out of time on our self-imposed deadline. :(

Regarding variability of flows, you can cement a given flow by pressing the `rerun` button. That takes AI out of the driver's seat and the flow will rerun the set of actions decided on in the original flow as if it's on rails.

Regarding creating a test manually, that will be a best fit for pages that have consistent selector logic for elements, though we found that as soon as a page starts randomizing element IDs, this approach starts to struggle. We get around this by creating a prioritized list of selectors for every action that touches the DOM, so that if `document.querySelector("#shenanigansId")` fails, the run can still continue by choosing the next-best selector, and so on. Thankfully this logic requires no AI at all, though it is heuristic at the end of the day.
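
To illustrate the prioritized-selector idea described above, here is a minimal sketch in Python. It is not Donobu's actual code (which is Java); the function name `find_with_failover`, the `query` callback, and the toy `page` dictionary are all hypothetical stand-ins for a real DOM lookup such as `document.querySelector`.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

E = TypeVar("E")


def find_with_failover(
    selectors: Sequence[str],
    query: Callable[[str], Optional[E]],
) -> Tuple[str, E]:
    """Try each selector in priority order and return the first match.

    `query` stands in for a DOM lookup such as document.querySelector:
    it returns the matched element, or None when nothing matches.
    """
    for selector in selectors:
        element = query(selector)
        if element is not None:
            return selector, element
    raise LookupError(f"no selector matched: {list(selectors)}")


# Toy "page": maps the selectors that currently match to a fake element.
# The randomized ID recorded in the original run is no longer present.
page = {
    "button[aria-label='Search']": "<button aria-label='Search'>",
    "form button": "<button aria-label='Search'>",
}

used, element = find_with_failover(
    ["#shenanigansId", "button[aria-label='Search']", "form button"],
    page.get,
)
print(used)  # button[aria-label='Search']
```

The first selector fails because the ID changed, so the lookup falls through to the next-best one. No AI is involved; how the priority order is chosen is the heuristic part and is not shown here.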


Thanks for the detailed response.

The Playwright dev team would probably say that you should avoid using IDs as selectors and instead favour the use of selectors based on user-visible aspects, e.g. "a link including the text 'cat'" or "a button with the label 'register now'". That way your tests are immune to under-the-hood changes the user is oblivious to. The range of selector options (locators in their world) is a real strength of Playwright.
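
A small dependency-free sketch of why a label-based lookup survives ID churn. This uses Python's standard `html.parser` rather than Playwright, and `ButtonFinder`, `find_button_ids`, and the two sample pages are made up for illustration. In Playwright's Python API the corresponding locator would be along the lines of `page.get_by_role("button", name="Register now")`.

```python
from html.parser import HTMLParser
from typing import List, Optional


class ButtonFinder(HTMLParser):
    """Collects the id of every <button> whose visible text equals `label`."""

    def __init__(self, label: str) -> None:
        super().__init__()
        self.label = label.strip().lower()
        self.matches: List[Optional[str]] = []
        self._current_id: Optional[str] = None
        self._in_button = False
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._current_id = dict(attrs).get("id")
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            # Compare the accumulated visible text, ignoring case and padding.
            if "".join(self._text).strip().lower() == self.label:
                self.matches.append(self._current_id)
            self._in_button = False


def find_button_ids(html: str, label: str) -> List[Optional[str]]:
    finder = ButtonFinder(label)
    finder.feed(html)
    finder.close()
    return finder.matches


# The same page rendered twice; only the generated id differs between builds.
build_a = '<form><button id="btn-x91f">Register now</button></form>'
build_b = '<form><button id="btn-7c2e">Register now</button></form>'

print(find_button_ids(build_a, "register now"))  # ['btn-x91f']
print(find_button_ids(build_b, "register now"))  # ['btn-7c2e']
```

A selector like `#btn-x91f` recorded against the first build would miss on the second, while the label lookup finds the button in both.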


Nice work! Can you add support for OpenRouter? Then it's easier to switch models too.

Edit: it's actually really impressive; I try them all every so many months but this one seems to be the first one that actually works.


Looks good. I can understand the few UX quirks, as this is an early version. Unfortunately, I keep getting an "Internal error".

https://www.dropbox.com/scl/fi/8npc2rppe0soyz15dmbhk/screens...


Oh shoot, sorry to hear that; there might be interesting logs in "~/Library/Application Support/Donobu/app.log"


This looks really interesting and could really be a nice addition to my daily work.

I just downloaded the application, but am unable to add OpenAI API keys. It's probably on my end (I use quite aggressive DNS blocking lists). So my guess is: I'm unable to add API keys when telemetry is blocked.

Suggestion: please add an error message when this occurs, i.e. whether the request failed (500), the key was faulty, etc.


Thank you for the direct actionable feedback, we will improve that messaging.

Regarding debugging your specific problem: when you try to add an API key, the local process attempts a 1-token request to the cheapest model on the chosen GPT platform (in your case, gpt-4o-mini on OpenAI) to verify that the key works. If the account has no balance, this request may fail even though it costs a fraction of a fraction of a penny, and anything that fails that request will cause the API key to be considered invalid.


You can also request the model list to check the validity of a key. No tokens needed.
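
A rough sketch of that model-list check in Python, which also separates "bad key" from "could not reach the API" (the distinction asked for upthread). It assumes OpenAI's `GET https://api.openai.com/v1/models` endpoint with a Bearer token; the function names `http_status` and `check_api_key` and the returned strings are hypothetical and not anything Donobu ships.

```python
import urllib.error
import urllib.request
from typing import Callable

# OpenAI's model-listing endpoint; it authenticates the key without generating tokens.
MODELS_URL = "https://api.openai.com/v1/models"


def http_status(url: str, api_key: str) -> int:
    """Send an authenticated GET and return the HTTP status code."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses are raised as exceptions


def check_api_key(
    api_key: str,
    get_status: Callable[[str, str], int] = http_status,
) -> str:
    """Classify a key: 'valid', 'invalid key', 'unreachable: ...', or 'unexpected response (N)'."""
    try:
        status = get_status(MODELS_URL, api_key)
    except OSError as error:  # DNS blocking, refused connection, timeout
        return f"unreachable: {error}"
    if status == 200:
        return "valid"
    if status == 401:
        return "invalid key"
    return f"unexpected response ({status})"


# Offline demonstration with a stand-in transport instead of a real request.
print(check_api_key("sk-test", lambda url, key: 401))  # invalid key
```

Passing the transport in as `get_status` keeps the classification logic testable without network access; calling `check_api_key(key)` with the default performs the real request.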


Any support for locally installed LLMs?


Not yet, but we are looking into supporting Llama. Would love to support local LLMs so we can say the app is entirely local.


Tried it out — nice work guys! Will be using for our E2E testing.


Just FYI I signed up and your "Confirm Your Signup" email went into our junk mail (o365)

e: and the magic link doesn't work: "Sorry, we could not authenticate you. Please try again."


I ran into this issue before when building a magic link system. Seems Microsoft visits the link to check for risks, which in turn nullifies that link before it hits your inbox. Fun times.
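
A toy Python model of that failure mode, for anyone who has not hit it. The `MagicLinks` class and its method names are invented for illustration and say nothing about how Donobu's signup actually works. The second half shows one commonly used workaround (redeeming the token only on an explicit confirmation step), which is an assumption on my part and not something confirmed in this thread.

```python
import secrets
from typing import Set


class MagicLinks:
    """One-time login tokens, with two redemption styles for comparison."""

    def __init__(self) -> None:
        self._unused: Set[str] = set()

    def issue(self) -> str:
        token = secrets.token_urlsafe(16)
        self._unused.add(token)
        return token

    def _redeem(self, token: str) -> bool:
        """Consume the token; succeeds only the first time."""
        if token in self._unused:
            self._unused.remove(token)
            return True
        return False

    def handle_get_fragile(self, token: str) -> bool:
        """Fragile design: simply visiting the link redeems the token."""
        return self._redeem(token)

    def handle_get_safe(self, token: str) -> bool:
        """Safer design: a visit only reports whether the token is still usable."""
        return token in self._unused

    def handle_post_confirm(self, token: str) -> bool:
        """Safer design: the token is redeemed only on an explicit confirmation."""
        return self._redeem(token)


links = MagicLinks()

# Fragile design: the mail scanner's prefetch uses up the link.
token = links.issue()
scanner_visit = links.handle_get_fragile(token)  # scanner "logs in"
user_visit = links.handle_get_fragile(token)     # user gets the error
print(scanner_visit, user_visit)  # True False

# Confirm-step design: a prefetch GET leaves the token intact.
token = links.issue()
links.handle_get_safe(token)             # scanner prefetch, nothing consumed
print(links.handle_post_confirm(token))  # True
```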


Interesting! How far away is this tool from being a simple desktop RPA solution? Would it be able to interact with desktop apps and simulate mouse or keyboard actions in the future?


Right now we are constrained to using the DOM of webpages. This is because we need some reasonable way to detect what parts of a UI are interactable or not (though we do some magic to do this); this would be a bigger challenge for arbitrary desktop apps. Also, there would be new security aspects that would have to be meticulously considered if we were to allow an agent to control arbitrary desktop apps. Constraining the agent to the browser has nice alignment in this way.


Is there anything in the tech stack that makes this app specific to Macs, or are you simply rolling this out to Macs first?


We're rolling out to Macs first since that is what our development environment has been, though nothing is stopping us from supporting Linux or Windows once we have those environments properly set up. It is on the near-term road map!


This looks like a promising start.

Any near term plans for exploring the use of local LLMs?


Yes, looking into supporting Llama!


Just to be clear, did you really decide to use Java to build a desktop application for Mac? I see you mentioned Java 21 as main business logic layer, which technology did you use to build the desktop application?


We really did! :) Though, there were some unfun challenges around that, like getting distribution to work without having to get people to go through the pains of installing the JDK. Thankfully, since our UI is just a web browser, we did not have to go down the path of JavaFX or anything like that; our UI is just plain JS/HTML making API requests to a propped up server on localhost:31000 (for the curious).


Nice work, can you export the code later for integration with our CI/CD pipeline?


I see a button to export to JavaScript code. I don't know if they support direct integration with CI/CD.


There is an API at localhost:31000 and there is a hidden one-shot mode to rerun a flow if you happen to work the command line just right, so it is technically feasible to integrate with CI/CD, though all that needs proper docs before we would expect anyone to reasonably do that. It is on our roadmap though, and the lift is not high.

Regarding exporting to JavaScript, it seems you found the button for it. We like to call it a "draft of a script", as it is generated by an eager-to-please LLM, so having a real engineer give it a look over is useful. It should be enough to get you off the ground though.


Are you planning to make that into a server blob we can run on a non-mac server? CI/CD is 'kind of' important!


Yes, internally we have an Ubuntu build working but it has not been publicly released yet (its release is on the near-term roadmap). If you have an immediate need, please reach out, and we can work it out.


I will use it immediately as soon as Windows users can use the app and then export to CI/CD/code.



