Playwright: Automate Chromium, WebKit and Firefox (github.com/microsoft)
383 points by cyrusmg 3 months ago | 141 comments



Interesting tidbit:

One of the main contributors to this project[0] was the core contributor (creator?) of Puppeteer[1], but then I guess left Google to join Microsoft and work on this[2][3].

[0] - https://github.com/aslushnikov

[1] - https://github.com/puppeteer/puppeteer/

[2] - https://github.com/microsoft/playwright/graphs/contributors

[3] - https://github.com/microsoft/playwright/graphs/contributors


I wonder what the story is there. Why wouldn't MS just have him continue to work on Puppeteer? They're both open source, so there's not much point in "owning" their own clone of it.


I work at Google, but this is based on stuff I knew before I worked at Google, which I heard from a coworker. I haven't checked on the project since joining. The Puppeteer TL (the guy linked in the grandparent comment) apparently had ambitions to make Puppeteer work cross browser like Playwright does now. However, the Puppeteer project was heavily deprioritized and the TL would basically never be able to achieve their vision. This made it pretty easy for Microsoft to basically take the entire Puppeteer team from Google. That coworker of mine told me also that after all those Puppeteer devs left, Puppeteer is now basically only a 20% project worked on by a few people. The number of open issues/PRs done kinda reflects that (no idea if this is still true or not).


I manage the team at Google that currently owns the Puppeteer project.

The previous team that developed Puppeteer indeed moved to Microsoft and have since started Playwright.

While it is true that staffing is tight (isn't it always), the number of open issues does not tell the full story. The team has been busy with addressing technical debt that we inherited (testing, architecture, migrating to Typescript, etc) as well as investing in a standardized foundation to allow Puppeteer to work cross-browser in the future. This differs from the Playwright team's approach of shipping patched browser binaries.


> The team has been busy with addressing technical debt that we inherited [...] migrating to Typescript

Wow, not writing stuff in TypeScript is now considered technical debt? I knew people were already rushing to rewrite everything in TypeScript if they could, but didn't know we'd come this far along the hype-cycle already.


Yes definitely. I've worked at two companies in three years spanning 250,000 employees and both companies consider writing JavaScript deprecated in favor of typescript.


Perhaps because Typescript is a Microsoft baby?


GP manages puppeteer team at Google


I used Puppeteer on a project recently to generate some really big and complex PDFs that would have been a massive pain to do any other way, so thanks for your work, and I'm very happy to hear that the project isn't dead.


Glad to hear that. Puppeteer still has a number of compelling things over Playwright (like not shipping patched binaries) so I hope competition in this space can continue to happen :)


> This differs from the Playwright team's approach of shipping patched browser binaries.

Can you expand on that?


"Each version of Playwright needs specific versions of browser binaries to operate." [0]

They patch and compile browser binaries so they have the functionality Playwright needs.

Their build of Chromium is one release ahead of what's out but it looks like one could maintain a library of older Playwright browser binaries to test with. They probably have an older Firefox 91 binary that's feature-equivalent to the current Firefox ESR. Their WebKit builds won't ever be exactly the same as Apple Safari.

[0] https://playwright.dev/docs/browsers


This is also my understanding.


From what I know, Puppeteer works only with Chromium; this could be a deal breaker for Microsoft.


Puppeteer has experimental Firefox support.

https://pptr.dev/#faq:~:text=What%20is%20the%20status%20of%2...


I think that has been the case for a very long time.


Edge is chromium though.


Microsoft's web products run on more than Edge.


They try very hard to stop you trying, though; Microsoft Teams has removed the “continue without the Teams app” button entirely on Firefox, even though I'm fairly sure it still works fine in Firefox.


I'm assuming you're not on macOS, right? Even though Firefox supports H.264, it's not really consistent outside of macOS because Firefox doesn't want to deal with patent hassles. This is true of Slack as well. Some developers would actually support Firefox if it had decent H.264 support, but that support is inconsistent outside macOS, where Firefox can rely on Apple's good integration.

P.S. OpenH264 doesn't help, mainly because the problems are decoding bugs, not encoding bugs.


> Even though Firefox supports H.264, it's not really consistent outside of macOS because Firefox doesn't want to deal with patent hassles.

That's only a concern for video chats. Teams does a lot of other stuff too, and there's no reason you shouldn't be able to use that on Firefox.


You get the same message from macOS Firefox.


I'm exaggerating but Playwright vs Puppeteer is a bit like comparing Puppeteer with Selenium


Not at all true


I don't really have a horse in this race, but based on the thread, why would this be wrong? Are any of these automation frameworks really different? What would be the differences? When would I use one versus the others? Especially, why would I use Puppeteer if it is a dead project as implied?


Puppeteer and Playwright did a lot for the industry, but we feel the times have changed. We develop software very quickly, and it's changing a lot, which means we need a tool that can keep up. In DevOps, QA should be able to develop and maintain end-to-end tests quickly. Shift-left testing is all about that. Programming tests is very costly, time-consuming, and exhausting. There is a reason why only up to 6% of us do QA/testing (according to the Stack Overflow survey https://insights.stackoverflow.com/survery/2021#developer-ro...): in the long term it is boring and repetitive. These are reasons to create something new, and that's why we built https://bugbug.io. I'm very open to your opinions and ideas!


What do people think of Playwright vs Cypress? I've been considering using Playwright instead as it supports more browsers and I feel like it's easier to do production monitoring (by putting it in an AWS Lambda or using Checkly)

- Cypress: https://www.cypress.io

- Playwright aws lambda: https://github.com/PauloGoncalvesBH/running-playwright-on-aw...

- Checkly: https://www.checklyhq.com


My main issue with Cypress was the fact that it required a shift in mental model, i.e. queuing commands rather than executing them, thenables that aren't actually promises, running as client side JS as opposed to in a controller script. There were a ton of sharp edges and required a lot of education to adopt.

Meanwhile Playwright's API is just promises. Everybody knows how to compose promises.

The Cypress approach does in theory completely eliminate latency-induced flakiness, but in my experience, Playwright/Puppeteer's usage of bi-directional real-time protocols to communicate with the browser makes latency low enough that latency-induced flakiness is no longer a practical concern, especially when paired with the ability to selectively run test logic in client-side JS in the page with 0 latency.

Selenium did suffer from latency-induced flakes all the time due to its slow uni-directional request/response model. I personally believe the Cypress model is an over-correction for Selenium PTSD, and isn't making as good a set of tradeoffs compared to Playwright/Puppeteer.
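
To make the mental-model difference concrete, here is a toy sketch in TypeScript. It is not the Cypress or Playwright API: `CommandQueue`, `clickNow`, and `callLog` are made-up names, and the real tools do far more. It only shows that a queued command does nothing until a runner drains the queue, while a promise-returning call is simply awaited.

```typescript
// Toy model only: NOT the Cypress or Playwright API. All names are hypothetical.
const callLog: string[] = [];

// Queued style: calling a command only records it; nothing runs yet.
class CommandQueue {
  private pending: (() => void)[] = [];

  click(selector: string): this {
    this.pending.push(() => {
      callLog.push(`queued click ${selector}`);
    });
    return this;
  }

  // The runner drains the queue later, in order.
  run(): void {
    for (const command of this.pending) command();
    this.pending = [];
  }
}

// Promise style: the call does its work and the caller awaits the result.
async function clickNow(selector: string): Promise<void> {
  callLog.push(`awaited click ${selector}`);
}
```

With the queue, `new CommandQueue().click("#a")` leaves `callLog` empty until `run()` is called, which is the part that tends to surprise people used to composing promises.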


The irony is that the Cypress model is extremely similar to the Selenium Version 1 model. We abandoned that approach in Selenium V2 (WebDriver). Time is a flat circle.


This. Similarly, Cypress clicks with JavaScript rather than through the debugger, which leads to weird bugs, much like old versions of Selenium.


I am leading a transition at work from Selenium to Cypress precisely because of our frustration with the WebDriver model. I don't consider myself an expert enough to know which approach is right, but could you elaborate on why Selenium chose to divert in V2 to what it is today?


We see a strong adoption of Playwright. It’s the default we recommend to users now. We also support Puppeteer, but its development is lagging.

Having said that, I would love to support Cypress if I had infinite time and focus.

Side note: Selenium is not on the menu, even with its large install base.

We are aiming for where the puck is going and it’s going to Playwright.

Full disclaimer: I’m CTO at Checkly.


The checkly folk recently published this

https://blog.checklyhq.com/cypress-vs-selenium-vs-playwright...

Cypress vs Selenium vs Playwright vs Puppeteer speed comparison


I see Puppeteer is actively developed, so I wonder what 'lagging' implies.

Doesn't Playwright use Puppeteer for Chromium-based browsers (even Edge)? I thought it was just a wrapper for Puppeteer with extra support for Firefox.


How about Python bindings? Selenium is the only decent option in that space so far, are we wrong?


looks like playwright is bridging this gap too:

https://playwright.dev/python/


Playwright Python is excellent


Any insights on the flakiness of Playwright vs other tools you’ve used?


Why was Selenium off the menu?


Selenium is slow and based on webdriver. I think most consider it legacy at this point. Most new projects tend to use Playwright, Cypress, or Puppeteer for E2E tests. All three options are much more performant and reliable.


I've been out of this space for a while, but isn't WebDriver the only cross-browser spec for browser automation? The last I knew, the browser vendors were removing other automation hooks for security reasons and heavily influenced the WebDriver spec.


Well, playwright is cross browser too.

I've only been using/playing with it for a few weeks, but I find it more contemporary and less clunky than selenium. And it's lighter and runs faster. I dislike passing around a driver to tests and the playwright built-in testing (which I think is wrapping the expect library) is pretty nice.

I haven't fully made peace with the selector API in playwright, which uses a lot of experimental css selectors and a custom DSL, but these are minor learning curve issues.


I'm sorry if I wasn't clear, but I was asking about the WebDriver spec [1]. The terminology gets confusing because WebDriver was a tool that then merged with Selenium and Selenium implemented the WebDriver wire protocol. If Playwright isn't using WebDriver, is there a new browser automation API?

Edit: It looks like it's using Chrome's DevTools protocol. I'll have to read up more on that. I thought Google deliberately didn't want that to be used for automation, but I'm probably thinking of something else.

[1] -- https://www.w3.org/TR/webdriver2/


I'm not sure if it's using (or can use) CDP for all automation or for chromium, but CDP became my go to for headless browser work in the last six months, it's bypassing the abstraction of puppeteer and has a lot more language support.

But it's not itself a test framework, nor is puppeteer, so it's nice having a more modern toolset on top.


fwiw, Selenium (in v4) now supports CDP. From the website:

"WebDriver Bidi is the next generation of the W3C WebDriver protocol and aims to provide a stable API implemented by all browsers, but it’s not yet complete. Until it is, Selenium provides access to the CDP for those browsers that implement it (such as Google Chrome, or Microsoft Edge, and Firefox), allowing you to enhance your tests in interesting ways."

https://www.selenium.dev/documentation/webdriver/bidirection...


WebDriver through WebdriverIO is actually really solid and I think more teams should give it a try. My understanding is you don’t need any Selenium tooling if you use it but can interface with projects like chromedriver directly.

It’s faster than Cypress and you have access to a massive ecosystem of things that speak the WebDriver protocol, as well as the ability to do hybrid and native mobile testing (through Appium).

I’ve been really happy with it and we’ve decided to go all-in on it for apps in the Ionic ecosystem.


Cypress was such a pain that I simply wasn't writing or running tests just to avoid it. It was incredibly slow, had major bugs that are still open after multiple years, and worst of all, in headless mode it was passing tests that should have been failing (and were failing in headed mode). The only reason I discovered it was because headless mode seemed to run unusually fast.

I switched to Playwright 6 months ago - I'm much happier now and the switch also allowed me to delete 3/4 of the helper code that I needed before.


On my team we evaluated Cypress and Playwright and landed on Playwright. Some features it has that Cypress didn’t were Safari support, support for cross domain tests, and support for tests that need to open multiple browser windows (for testing collaborative editing).


I love Cypress. there are definitely limitations though like iframe support or visiting two separate domains, tab support

https://docs.cypress.io/guides/references/trade-offs#Permane...


- Cypress does not run on M1 natively.

- Playwright is more lightweight. Can be good or bad on what you expect. But I definitely prefer Playwright.


Cypress was a terrible choice for us because flows in our system are managed by multiple users. As Cypress can only work on a single open window/tab, you had to effectively duplicate, triple, etc. each test. Playwright also lets you trivially control or cross-check the backend (or whatever else), as you're in a Node.js context, not a browser context.


Playwright is definitely better IMO. Cypress is overengineered.


I wouldn’t say cypress is over engineered. Just a byproduct of its time.


What do you mean?


I highly recommend codeceptjs. After 4 years of using test cafe for e2e testing, codecept has proven to be much more pleasant to use.


There is also gauge/taiko https://taiko.dev/


We recently picked Playwright for a project because we could do more with the "mouse". With Playwright you can click coordinates on the page. I did not find an easy way to do this using Cypress. Aside from that, Playwright seems to be so much faster.


Electron appear to have dropped support for their previous automated testing framework Spectron - https://github.com/electron-userland/spectron/blob/master/RE... - and now suggest Playwright as an alternative: https://www.electronjs.org/docs/latest/tutorial/automated-te... and https://playwright.dev/docs/api/class-electronapplication/


WebDriver or Playwright. I switched from Spectron to selenium-webdriver.


Anyone have experience with Playwright compared to Selenium? I have a fairly large test suite and Selenium produces constant false positive errors, typically due to various timeouts that seem fundamentally unsolvable when running it from .NET. It's just very finicky.

I don't know if it's Selenium specifically or some problem with the .NET binding, but I figure Microsoft must have better .NET integration so it will at least eliminate that possible source of problems.


I'm not sure if any of these are pertinent to your tests, but these are the issues I see most often that cause flaky tests:

- Hard-coded waits in your code, like "Thread.sleep(1000)". A better alternative is to replace hard-coded waits with something that waits for an element or value to appear on the page. i.e. click on a button and wait for a 'Success' message to appear. Puppeteer and Playwright both have good constructs for doing this.

- Needless complexity in the tests. Conditionals in particular are a code-smell and indicate there's something needlessly complex about the test.

- No test data management strategy. The more assumptions you can make about the state of your application, the simpler your tests become. Ideally tests are running in an environment that nothing else is touching, and you're seeding data into that environment before tests run. I personally don't believe in mocking data in regression tests since that quickly becomes hard to manage.

We spend a lot of time thinking about these issues at my company and wrote a guide that covers other common regression testing issues in more detail here: https://reflect.run/regression-testing-guide/
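
As a rough illustration of the first point (waiting on a condition instead of sleeping for a fixed time), here is a minimal TypeScript sketch. `waitUntil`, `fakePage`, and `clickSave` are hypothetical names invented for this example; they are not part of Selenium, Puppeteer, or Playwright, which ship their own waiting constructs.

```typescript
// Hypothetical helper: poll a condition until it holds or a deadline passes.
async function waitUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) return;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Short pause between checks, instead of one long blind sleep.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for a page whose 'Success' message shows up some time after a click.
const fakePage = { message: "" };

function clickSave(): void {
  setTimeout(() => {
    fakePage.message = "Success";
  }, 120);
}
```

A test would call `clickSave()` and then `await waitUntil(() => fakePage.message === "Success")`, so it finishes as soon as the message appears and fails loudly if it never does.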


> - Hard-coded waits in your code, like "Thread.sleep(1000)". A better alternative is to replace hard-coded waits with something that waits for an element or value to appear on the page.

We don't do any timed waits, all of our waits are for an element or value to appear, but these waits never complete sometimes, non-deterministically. We then added a long 5 min timeout on these waits because we know the test will never complete at that point. It's always fine in manual testing though, and if we don't run the browser in headless mode and watch it work. Very frustrating.

Sometimes the HTTP requests themselves timeout after a few minutes, but this never happens in manual testing either. That's actually the most common issue these days, and this happens non-deterministically too. This is what I meant by "flaky".


I wonder if the infrastructure that's driving the browser tests is underpowered. If the browser process is dying silently or CPU is getting maxed out, it could manifest in what you're describing where it happens intermittently and there's very little to go on. I'm assuming you're running these tests in Chrome... You could check the Chrome debug logs to see if anything is being spit out there.


Does Selenium have a trace log generator that will dump out all events? I.e. all element creation on the page, matching, etc.

I'm not familiar with it specifically, but that's my go-to starting place in weird automation issues like that. Normally it gives some kind of hint as to why that's happening (or why Selenium thinks it's happening).


While I have not used Playwright (but have a lot of experience with Se), I would say the code style is refreshing:

    // Expect an element "to be visible".
    await expect(page.locator('text=Learn more').first()).toBeVisible();

Writing await for every action makes the timeout of the action seem more explicitly declared. There also seems to be more granular control of timeouts: https://playwright.dev/docs/test-timeouts

> I don't know if it's Selenium specifically or some problem with the .NET binding

If the execution in .NET is slow then I suppose it could be .NET. But it could be (and often is) the suite design. You must wait for /everything/ before interacting with it because the code execution is quicker than the page.

Large Se/WebDriver suites are often a PIA. I find it's nice to write them with Python or Ruby so they can be debugged interactively with an interactive shell.


> If the execution in .NET is slow then I suppose it could be .NET. But it could be (and often is) the suite design. You must wait for /everything/ before interacting with it

That's what I do, but the wait for an element in certain tests times out after a few minutes, even though the elements are clearly visible, and manual use never has an issue.

From other comments it sounds like Puppeteer and Playwright are better on this, so will look into switching.


When I fixed many similar Selenium/WebDriver tests, the root cause was always the same: you grab a reference to an element and, for example, wait for it to become enabled or for some text to appear. But your UI framework actually replaces the element in the DOM while doing its thing, and your reference to the stale element will never change. Fix is to loop searching the element with selector and check if the element fills the conditions. If not, retry from search again. We had nice helpers for those and had very stable selenium tests.
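
A minimal sketch of that kind of helper, in TypeScript, using a toy in-memory "DOM" rather than a real WebDriver session. `dom`, `rerender`, and `waitForEnabled` are hypothetical names made up for illustration; the only point is that the lookup happens again on every attempt.

```typescript
interface FakeElement {
  enabled: boolean;
}

// Toy "DOM": selector -> current node. A re-render swaps in a brand-new node.
const dom = new Map<string, FakeElement>([["#save", { enabled: false }]]);

function rerender(): void {
  // The old node object is never updated; it is simply replaced.
  dom.set("#save", { enabled: true });
}

// Re-query on every attempt instead of holding on to one reference.
async function waitForEnabled(
  selector: string,
  attempts = 20,
  delayMs = 10,
): Promise<FakeElement> {
  for (let i = 0; i < attempts; i++) {
    const el = dom.get(selector); // fresh lookup each time
    if (el !== undefined && el.enabled) return el;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`${selector} never became enabled`);
}
```

A reference grabbed before `rerender()` stays disabled forever, while `waitForEnabled("#save")` picks up the replacement node.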


> Fix is to loop searching the element with selector and check if the element fills the conditions. If not, retry from search again. We had nice helpers for those and had very stable selenium tests.

Thanks, I'll double check, but I think we do this now. In looking at the history of test failures, those failures are indeed less common, but still plenty of false positives of other types. Most persistent recent failures are the WebDriver timing out when loading a URL, which has never happened while manual testing or when being used by end users, so not sure what's going on there.

In any case, if the Playwright API encourages better idioms for writing tests that avoids these pitfalls, that would be cool because I deal with a lot of work term students that aren't adept at this kind of stuff so that would save a lot of headaches.


In my experience another common issue is a race condition in the trigger, i.e. the system is not quite settled yet when an interaction is performed, so the interaction does not register.

This is more likely when the system is loaded or shared, e.g. a CI VM.

It’s instructive to watch a screencast / recording of UI tests, because you don’t necessarily intuit how spazzy and fast the harness will perform its interactions.


I have had similar issues with selenium via other languages too - it is generally pretty flaky. E.g. saying a button or some other element doesn't exist when it clearly does.

With great care and effort you can make your tests reliable (especially if you are happy to allow a "best of 3" type test strategy to allow for 1 flake and 2 passes) though. Prodigious use of the wait (i.e. stdlib polling) primitives seems to give you the most bang for your buck.
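
For what it's worth, the "best of 3" idea can be sketched in a few lines of TypeScript. `passesMajority` is a hypothetical wrapper written for this comment, not a feature of Selenium or any test runner, and treating a thrown error as a failed run is an assumption.

```typescript
// Hypothetical wrapper: run a flaky check up to `runs` times and pass once a
// majority of runs have passed (2 of 3 by default).
async function passesMajority(
  check: () => Promise<boolean>,
  runs = 3,
): Promise<boolean> {
  const needed = Math.floor(runs / 2) + 1;
  let passed = 0;
  for (let i = 0; i < runs; i++) {
    try {
      if (await check()) passed++;
    } catch {
      // Assumption: a thrown error counts as a failed run.
    }
    const remaining = runs - i - 1;
    if (passed >= needed) return true; // majority reached
    if (passed + remaining < needed) return false; // majority now impossible
  }
  return false;
}
```

The early exits mean a test that fails twice in a row stops there instead of burning a third run.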

I am not sure if this is just the nature of web automation, or if Selenium is just crap. My gut says it is Selenium's fault, since we never get the same issues when using JavaScript in the DOM or in an extension... maybe it is the browser APIs, I guess? I have no idea, but if Playwright is any better then that would be superb.


I think the issue here is that Selenium and Playwright both depend on selectors, which depend on the UI, so when the UI changes the tests break. We are working on something that generates adapting code for you (Cypress first) and lets you know when your test script needs to change.

We have an AI model that understands the page structure as humans do. So we can do this "Click on 'Sign in' on the 'Login' page".

We have a no-code tool as well which adapts to the changes. But we want to generate the code for people who want to keep things internally.

Would love to discuss it more: m@preflight.com Our website: https://preflight.com


On paper Playwright should be a LOT better - it's taken a similar approach to Cypress, where everything is designed around the need to reduce flaky tests.

In Playwright that manifests itself as the "auto-wait" feature: https://playwright.dev/docs/actionability

You can do this kind of thing with Selenium too but it wasn't designed in from the very start of that project.


I think it's possible to write tests in Selenium which are time-independent... E.g. "Wait for element #foo to exist".

You can also give the browser a virtual clock so that you can use time based timeouts and give every test a timeout of 3 hours, but those 3 hours only take milliseconds in real time. That approach gets CPU expensive if your site has any background polling scripts or animation, because obviously the animation will end up animating a lot during the test!
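
Here is a toy version of the virtual-clock idea in TypeScript, just to show the mechanics. `VirtualClock` is a hypothetical class written for this comment, not a browser or Selenium API; a real setup would have to hook the page's own timers.

```typescript
// Hypothetical virtual clock: timers are kept in a list and fired in due-time
// order when virtual time is advanced, without waiting in real time.
class VirtualClock {
  private now = 0;
  private nextId = 0;
  private timers: { id: number; due: number; fn: () => void }[] = [];

  setTimeout(fn: () => void, delayMs: number): number {
    const id = this.nextId++;
    this.timers.push({ id, due: this.now + delayMs, fn });
    return id;
  }

  currentTime(): number {
    return this.now;
  }

  // Jump virtual time forward, firing every timer that falls due on the way.
  advance(ms: number): void {
    const target = this.now + ms;
    while (true) {
      const dueTimers = this.timers.filter((t) => t.due <= target);
      if (dueTimers.length === 0) break;
      // Earliest timer first; ties fire in the order they were scheduled.
      const next = dueTimers.reduce((a, b) => (b.due < a.due ? b : a));
      this.timers = this.timers.filter((t) => t.id !== next.id);
      this.now = next.due;
      next.fn(); // may schedule more timers, e.g. a polling loop
    }
    this.now = target;
  }
}
```

Advancing three virtual hours returns almost immediately, but a callback that reschedules itself every second still runs 10,800 times, which is where the CPU cost mentioned above comes from.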


> I think it's possible to write tests in Selenium which are time-independent... E.g. "Wait for element #foo to exist".

Yes, this is what I've done but the elements non-deterministically do or do not appear according to selenium, and then the wait times out. This happens for dozens of tests across dozens of pages with no issue with manual use, so something fishy is going on.


I tried Selenium and then Playwright for a .NET project; Selenium wasn't easy to work with. Playwright was good, but for some reason which I don't recall exactly (it could have been because it had to redownload Chromium every time we deployed) I ended up switching to Puppeteer, and I'm very happy with it.


You can now generate puppeteer code from Google Chrome Recorder. You should check it out. But still it might be flaky.

No-code is the best in my opinion :D https://preflight.com

We have done all the groundwork, like:

- Concurrency

- Adapting to changes. Our selectors are like this: "Click on 'Login' button in the 'Sign in' form"

- Updating the tests with an HTML/video player, etc.


Benefits of Playwright over Puppeteer - official support for languages outside of JavaScript, and official codegen/record support. Great!


Also testing on safari/iphone is easy. It also has built-in snapshot testing. I just wish it was integrated with S3 as git LFS is not good


Reposting my previous notes on Playwright (https://news.ycombinator.com/item?id=30060135):

I just want to plug Playwright by Microsoft as I've been using it over the past month and have had a really great experience with it: https://playwright.dev It's built by the founders of Puppeteer which came out of the Chrome team. Some things I like about it:

1. It's reliable and implements auto-waiting as described in the article. You can use modern async/await syntax and it ensures elements are attached to the DOM, visible, stable (not animating), can receive events, and are enabled: https://playwright.dev/docs/actionability

2. It's fast — It creates multiple processes and runs tests in parallel, unlike e.g. Cypress.

3. It's cross-browser: it supports Chrome, Safari, and Firefox out-of-the-box.

4. The tracing tools are incredible, you can step through the entire test execution and get a live DOM that you can inspect with your browser's existing developer tools, see all console.logs, etc...

5. The developers and community are incredibly responsive. This is one of the biggest ones — issues are quickly responded to and addressed often by the founders, pull requests are welcomed and Slack is highly active and respectful.

My prior experience with end-to-end tests was that they were highly buggy and unreliable and so Playwright was a welcome surprise and inspired me to fully test all the variations of our checkout flow.


Having worked with these folks back in Chrome, it's been great seeing this project continue to be successful. Great job!


How hard is it for a site (server side or via JavaScript on the page) to tell that it is being accessed via a browser that is being automated with Playwright?

I've seen some sites that behave differently when the browser is being automated. E.g., if I access fanfiction.net from a browser being automated with Selenium it gets stuck in an endless Cloudflare CAPTCHA loop. Accordingly I've come to prefer automation methods that are less revealing to the site.


Playwright is great, especially if you are dealing with test cases that span multiple domains/contexts. I had to test some user flows which involved logging into two apps, each with three different users to perform and validate various actions. Playwright's context switching made it a breeze. Also, it offers a nice separation of browser automation and test runner API, so it can be used outside of E2E testing as well.


I’m working on a project that provides remote browsers, running on VMs/containers, capable of running Playwright tests (and Puppeteer scripts): https://headlesstesting.com/

We’ve seen a consistent growth of interest in people wanting to use Playwright for browser automation (and testing).


For generating PDFs like invoices in a webapp, are libraries like this the way to go these days, or is using a PDF lib still the norm?

Pros of Playwright/Puppeteer:

Reuse existing HTML/CSS knowledge

Cons: Requires an external service or shelling out to an external process

Pros of using a pdf lib:

Probably better performance, simpler architecture by being in-process.

Cons: Ad-hoc language for designing the PDF.


It's very simple to use either, there are loads of example implementations on GitHub.

I used one based on Docker, and the bottleneck was actually sending the HTML and CSS you want to print (if it's not already served over HTTP). I used a shared Docker volume to write to from one process (Python) and read from another (the Node Puppeteer process).

It all comes down to, load html, wait to load, save to pdf. Very simple, fast, and reliable. More so than weasyprint for example.


Playwright is a great tool. I was able to create a proof-of-concept stock screening tool using automation & screenshots of HTML elements to help me get swing trading ideas each morning/night when the market closed. It's a .NET Core console app using a CLI library called spectre.console based on rich(python) and playwright as the workhorse.

There's so much potential to use playwright in CI/CD with GitHub Actions cron jobs. Really enjoying it so far.


The only thing I wish we had was remote browser access - so I could run my tests on a VM (like within a docker image) and use a browser on the host.

We use TestCafe at work for this purpose. I personally hate TestCafe as it's an absurd unfocused mess of a browser remote, but it lets me control my browser by navigating to a URL, which no other browser remote system does.


You can do that with X11, set the display of your desktop inside the docker container and playwright's browser will appear on your desktop.


What is the ... operator for?

    test.use({
      ...devices['iPhone 13 Pro'],
      locale: 'en-US',
      geolocation: { longitude: 12.492507, latitude: 41.889938 },
      permissions: ['geolocation'],
    })


That is the JavaScript spread operator. It takes all of the elements of an object or an array and adds them to another. In this case, the code posted is merging the iPhone 13 Pro device settings object into an object literal.
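
A small standalone TypeScript example of the same pattern, with no Playwright involved. `devicePreset` is a made-up stand-in for a device entry, and its field values are invented for illustration.

```typescript
// Hypothetical stand-in for a device preset; the field values are made up.
const devicePreset = {
  viewport: { width: 390, height: 844 },
  isMobile: true,
  locale: "en-GB",
};

const mergedOptions = {
  ...devicePreset, // copy every property of the preset
  locale: "en-US", // keys listed after the spread override the copied ones
  permissions: ["geolocation"],
};

// Spread also works on arrays.
const browserNames = ["chromium", ...["firefox", "webkit"]];
```

Note that the copy is shallow: `mergedOptions.viewport` is the same object as `devicePreset.viewport`, and `devicePreset` itself is left unchanged.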


Most cool, thx


Looking at the source, I wonder why lots of the files have Google copyright?

https://github.com/microsoft/playwright/blob/0d277fa589e9508...

edit: ah puppeteer was a Google project. I forgot


Are there any products for QA folks that reduce the workload? I find most things are still done manually…


Yes definitely, there's lots of products in the QA space trying to tackle the problem you're describing. I'm a co-founder of a no-code product in the space (https://reflect.run). Being no-code has the advantage of enabling all QA testers to build test automation, regardless of coding experience.


We are also in the space. Manual testing is definitely time consuming. You can automate your manual testing with https://preflight.com

Would love to help any of your testing needs


Anyone tried BotCity?

https://botcity.dev


There is also this GitHub Discussions thread on how it compares with things like Cypress, which is an E2E testing tool used by lots of people on the frontend web. TLDR: Playwright can achieve nearly all the things Cypress can and more, due to it being a fully scriptable browser.

- https://github.com/microsoft/playwright/discussions/11201

- https://cathalmacdonnacha.com/cypress-vs-playwright-which-is...

- https://alisterbscott.com/2021/10/27/five-reasons-why-playwr...


Does it support screencast - video recording of the browser with audio?


It supports video recording (without audio), screenshots, and post mortem recording which is called Tracing.


I recognise your name from playwright! Thanks for the product, I love it.

All of the above are uploaded automatically as Github artefact as part of auto-generated Github Action!


Is there an open source web testing tool which also integrates a dashboard, keeps track of test runs, creates reports, something that I can just install on a vm and run to test a web app?



playwright html reporter kinda does that, no?


Does anyone know how it compares to NightwatchJs?

Br


Off-topic, but our freemium website is under attack by headless browsers.

The freemium service provides access to compute-heavy machine learning models running on GPUs.

Hackers blast 50-100 requests in the same second, which clog the servers and block legitimate users.

We reported IPs to AWS and use Cloudflare "Super Bot Fight Mode" to thwart attacks, but the hackers still break through.

We don't require accounts, but could impose account requirements if this helps.

Any suggestions?


Browser automation will occur by executing events in the DOM or by calling properties of the page/window. It’s all JavaScript designed for user interaction executed by a bot.

The one event that cannot be automated is cursor movement/position. Put a check into your event handlers that check that the cursor is actually over the event target.


You are right: every testing solution out there pushes UIEvents to the page rather than clicking with an actual mouse. That's why Puppeteer, Selenium, etc. are scraping tools, not testing tools.


That sounds like an accessibility problem.


Use an alternate control for keyboard navigation that is visually hidden and is accessed only by tab focus.


This is interesting. Thanks for sharing.

Are you saying block form submission unless the cursor is over the event target?

If so:

* How to handle legitimate requests from mobile users?

* How to handle form submissions with the "return" key?


Mobile users will use touch events instead of click events, and likely your interface and the screen width will be different. Check for these things, along with keywords from the user agent string, to distinguish mobile users from other users.

Return key on a control in a form will fire a submit event. Check for cursor position in your submit handler.


Just block the AWS ASN on CF, it's not worth fighting.


+1 and GCP, and many other hosting ASNs


Tell freemium users what is the acceptable rate for requests per second. Publish the allowable rate on the website. Ban freemium user IPs that exceed the allowable rate. This can be done using a proxy.
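
A minimal sketch of the per-IP limit described above, assuming a sliding one-second window. The class name `SlidingWindowLimiter` and the default of 5 requests per second are made up for illustration, and a real deployment would more likely enforce this in the proxy or in a shared store rather than in one process's memory:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client IP."""

    def __init__(self, max_requests=5, window_seconds=1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        """Return True if the request may proceed, False if it should be rejected."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `limiter.allow(client_ip)` first and answer with HTTP 429 when it returns False. Banning repeat offenders, as suggested above, would be a separate step on top of this.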


A proxy like Cloudflare or a custom proxy that stores data?

Are there proxy examples you could point us to?

Thanks for your help.



Thank you for this.


100 requests/second isn't that much, especially if you're fronting your website with Cloudflare. Do you have some unauthenticated endpoint(s) that eat up a ton of server CPU?


Thanks for the reply!

The freemium service provides access to machine learning models on GPU instances, served with FastAPI.

Each request invokes a compute-intensive ML model, but perhaps there is something wrong with the FastAPI configuration as well?


It could be.

I watch the FastAPI repos a lot, and tons of people do not understand how async Python works and put their models with sync code in an async context.


Consider us one. :)

We tried removing "async" -- thinking it would force sequential processing -- but it unexpectedly seemed to cause parallel processing of requests, which caused CUDA memory errors.

Before removing "async", this is the weird behavior we observed:

* Hacker blasts 50-100 requests.

* Our ML model processes each request in normal time and sequentially.

* But instead of returning individual responses immediately, the server holds onto all responses -- sending responses only when the last request finishes (or a bunch of requests finish).

* Normally, request 1 should return in N seconds, request 2 in 2N seconds, but with this, all requests returned in about N*50 seconds (assuming batch size of 50).

1. Any suggestions on this?

2. Mind clarifying how sync vs async works? The FastAPI docs are unclear.

Any help would be much appreciated.

This has been extremely frustrating.


Any chance the entire thing can be offloaded to a task queue (Celery/etc)? This would decouple the HTTP request processing from the actual ML task.

The memory errors you're seeing could suggest that you may not actually be able to run multiple instances of the model, and even if you could it may not actually give you more performance than processing sequentially.

Seems like ultimately your current design can't gracefully handle too many concurrent requests, legitimate or malicious - this is a problem I recommend you address regardless of whether you manage to ban the malicious users.


Yeah this is the way.

@headlessvictim2 search for "Asynchronous Request-Reply pattern" if you want more information about this kind of architecture. You will remove any bottleneck from the API server and can easily scale out from the task queue.


Thanks for the suggestion.

How would this work with GPU-bound machine learning models?

The model processing takes > 30 seconds and would still represent the bottleneck?


You would still have the same bottleneck, but the API request would return straight away with some sort of correlation ID. Then the workers that handle the GPU-bound tasks would pull jobs when they are ready. If you get a lot of jobs, all that will happen is the queue will fill up and the clients will wait longer and hit the status endpoint a few more times.

Here is an example of what it could look like: https://docs.microsoft.com/en-us/azure/architecture/patterns...
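
To make the shape concrete, here is a rough in-process sketch using only the standard library. It is not how Celery or the linked pattern is implemented: `submit`, `status`, `run_model` and the queue size of 100 are hypothetical names and values, and a real setup would put a broker and separate worker processes where the `queue.Queue` and thread are.

```python
import queue
import threading
import uuid

jobs = {}                           # job_id -> {"status": ..., "result": ...}
jobs_lock = threading.Lock()
pending = queue.Queue(maxsize=100)  # bounded, so overload becomes a fast rejection


def run_model(payload):
    """Placeholder for the slow GPU-bound call (hypothetical)."""
    return payload.upper()


def submit(payload):
    """What the submit endpoint would do: enqueue and return a correlation ID at once."""
    job_id = str(uuid.uuid4())
    with jobs_lock:
        jobs[job_id] = {"status": "queued", "result": None}
    try:
        pending.put_nowait((job_id, payload))
    except queue.Full:
        with jobs_lock:
            del jobs[job_id]
        return None  # the caller would answer 429 or 503 here
    return job_id


def status(job_id):
    """What the status endpoint would do: report progress without blocking."""
    with jobs_lock:
        job = jobs.get(job_id)
        return dict(job) if job else None


def worker():
    """Single worker: pulls one job at a time, so the model never runs twice at once."""
    while True:
        job_id, payload = pending.get()
        with jobs_lock:
            jobs[job_id]["status"] = "running"
        result = run_model(payload)
        with jobs_lock:
            jobs[job_id] = {"status": "done", "result": result}
        pending.task_done()


threading.Thread(target=worker, daemon=True).start()
```

The client keeps the ID returned by `submit` and polls `status` until it reports "done". Because there is exactly one worker per model, a burst of requests only lengthens the queue instead of piling concurrent work onto the GPU.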


Thanks for the explanation.

Right now, we use ELB (Elastic Load Balancer) to sit in front of multiple GPU instances.

Is this sufficient or do you suggest adding Celery into this architecture?


Python async is co-operative multi-tasking (as opposed to preemptive).

There is an event loop that goes through all the tasks and runs them.

The issue is the event loop can only move on to the next task when you reach an await. So if you run a lot of code (say an ML model) between awaits no other task can advance during this time.

This is why it is co-operative, it is up to a task to release the event loop, by hitting an await, so other tasks can get work done.

This is fine when you have async libs that often hit awaits at things that are IO related like say db, or http calls.

FastAPI will run controllers that are not defined as async functions on a thread pool, but it is still Python, so GIL and all that.

You should do as the sibling comment says and decouple your http from your ML and feed the ML with something like Celery. This way your server is always there to respond to things (even if just a 429) to hit a cache or whatever else.
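
A small experiment that shows the effect, as a sketch only: `slow_model` is a hypothetical stand-in that just sleeps, and `asyncio.to_thread` needs Python 3.9 or newer. A background ticker counts how often the event loop gets to run while the "model" is busy, first when the call is made directly inside a coroutine and then when it is handed to a worker thread.

```python
import asyncio
import time


def slow_model(seconds):
    """Stand-in for a synchronous, compute-heavy model call (hypothetical)."""
    time.sleep(seconds)
    return "done"


async def ticker(ticks, stop):
    """Record a tick every 10 ms; it only runs when the event loop is free."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def measure(offload):
    """Count how often the ticker ran while the model was busy."""
    ticks, stop = [], asyncio.Event()
    task = asyncio.create_task(ticker(ticks, stop))
    await asyncio.sleep(0)  # let the ticker start
    before = len(ticks)
    if offload:
        # Runs in a worker thread; the event loop keeps serving other tasks.
        await asyncio.to_thread(slow_model, 0.2)
    else:
        # Runs on the event loop itself; nothing else can advance meanwhile.
        slow_model(0.2)
    during = len(ticks) - before
    stop.set()
    await task
    return during


blocked = asyncio.run(measure(offload=False))
offloaded = asyncio.run(measure(offload=True))
print(blocked, offloaded)
```

The first number is zero because no await is reached during the call. The second is well above zero here because `time.sleep` releases the GIL; pure-Python number crunching in a thread would still contend for it, which is one more reason to move the model behind a queue.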


Thanks for the FastAPI explanation. This makes sense.

Right now, we use ELB (Elastic Load Balancer) to sit in front of multiple GPU instances.

Is this sufficient or do you suggest adding Celery to this architecture?


What are the bots' goals? Curious


To use premium settings without paying.

It appears less like malicious DDoS and more like pragmatic theft.


Could you add a symbolic, one-time fee to the "free" tier to deter multiple accounts and then implement reasonable rate-limits per-account?


what's your site? would like to play with it


Why not ReCAPTCHA?


Thanks for the suggestion.

It is possible, but this degrades the experience for legitimate users.

We prefer solving this without impacting/taxing normal users if possible.


Just add the captcha only for requests coming from the problematic ASNs, like AWS.

edit: Actually, since you use CF, just make a firewall rule that forces the captcha for those ASNs before it even gets to your app. They have a field named "ip.geoip.asnum" for that, and an action called "challenge" which will force a captcha.


Perhaps Captcha?


Thanks for the suggestion.

It is possible, but this degrades the experience for legitimate users.

We prefer solving this without impacting/taxing normal users if possible.


Recaptcha v3 doesn’t prompt if it thinks you’re a real user.


This could have major GDPR implications if that's something the parent cares about. ReCaptcha is basically Google spyware that happens to provide captcha services.


it's obvious the parent is trolling.


429



