Ask HN: Would you be interested in POM generator for Selenium?
20 points by mesadb on June 11, 2022 | 34 comments
I'm planning to implement a POM (Page Object Model) and script generator for Selenium that includes multiple selectors and uses some AI selectors.

It will work this way:
- Go to the page you want the POM for
- Open the Chrome extension
- Click on generate POM
- We will generate the POM
- You can add/remove some sections
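
For readers unfamiliar with the term, a page object wraps one page's selectors and actions behind a class, so tests never touch selectors directly. Below is a rough sketch of what generated output could look like. `SignInPage`, the selectors, and the `Driver` interface are all hypothetical stand-ins for illustration, not Selenium's API and not this tool's actual output.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Driver(Protocol):
    """Hypothetical minimal driver interface; a real one would wrap Selenium."""

    def type_into(self, selector: str, text: str) -> None: ...
    def click(self, selector: str) -> None: ...


class SignInPage:
    """Page object: tests call methods, only this class knows the selectors."""

    # Illustrative selectors; a generator would fill these in per page.
    EMAIL = "input[name='email']"
    PASSWORD = "input[name='password']"
    SUBMIT = "button[type='submit']"

    def __init__(self, driver: Driver) -> None:
        self.driver = driver

    def sign_in(self, email: str, password: str) -> None:
        self.driver.type_into(self.EMAIL, email)
        self.driver.type_into(self.PASSWORD, password)
        self.driver.click(self.SUBMIT)


@dataclass
class RecordingDriver:
    """Fake driver that records calls, so the sketch runs without a browser."""

    calls: list = field(default_factory=list)

    def type_into(self, selector: str, text: str) -> None:
        self.calls.append(("type", selector, text))

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))


driver = RecordingDriver()
SignInPage(driver).sign_in("user@example.com", "secret")
```

If a selector changes, only the class constant needs updating; every test that calls `sign_in` stays the same.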

Another benefit that might come with it is email and PDF testing for Selenium.

Curious what you think about this?




In my (admittedly limited) experience, the difficulty of working with Selenium doesn't stem so much from finding appropriate selectors but from brittleness: network delays/timeouts, objects not yet being loaded into the DOM, dynamically generated classnames (think scraping obfuscation).

I do see a niche for a system (AI or not) that deals with that last case (i.e. it automatically grabs some correlated selectors to fall back to), or as a quick tool to scrape static sites.
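
The fallback idea can be sketched in a few lines. This is a minimal illustration, assuming a `find` callable that returns `None` when nothing matches; the function name, the toy page, and the selectors are hypothetical, and a real version would wrap the driver's own lookup call.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def find_with_fallback(
    find: Callable[[str], Optional[T]],
    selectors: Sequence[str],
) -> Tuple[T, str]:
    """Try each selector in order; return the first hit and the selector used."""
    for selector in selectors:
        element = find(selector)
        if element is not None:
            return element, selector
    raise LookupError(f"no selector matched: {list(selectors)}")


# Toy "page": the generated class name is gone, but the data-test attribute survived.
page = {"[data-test='submit']": "<button>", "form button": "<button>"}
element, used = find_with_fallback(
    page.get,
    [".css-1x2y3z", "[data-test='submit']", "form button"],
)
```

Returning the selector that matched makes it possible to log when the primary selector has stopped working, before the fallbacks break too.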

However (and this may be out of your proposed scope, and that's fine) I see more value on something that can construct a sort of DOM timeline - in order to accurately know what is available and when. If you start "recording" network and user events since page load, you may be able to reconstruct a) which nodes the user interacted with b) which preconditions there are for these nodes to be consistently available, no matter which stochastic delays/errors are present.
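
One very rough way to approximate point (b), assuming each recorded run is just an ordered list of event names: keep the events that preceded the node's appearance in every run. The event naming scheme and the `preconditions` helper are invented for this sketch, and the result shows correlation across runs, not proven causation.

```python
from typing import Iterable, List, Optional, Set


def preconditions(runs: Iterable[List[str]], target: str) -> Set[str]:
    """Return events that occurred before `target` in every run where it appeared."""
    common: Optional[Set[str]] = None
    for events in runs:
        if target not in events:
            continue  # target never appeared in this run; skip it
        before = set(events[: events.index(target)])
        common = before if common is None else common & before
    return common or set()


# Two recorded runs with different network ordering and one unrelated click.
runs = [
    ["dom:load", "xhr:/api/user", "xhr:/api/cart", "node:#checkout"],
    ["dom:load", "xhr:/api/cart", "xhr:/api/user", "click:#banner", "node:#checkout"],
]
needed = preconditions(runs, "node:#checkout")
```

The unrelated banner click drops out because it only happened in one run, while both API responses remain as candidate preconditions.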

This is tricky and time consuming even when done manually, so I'm not sure it can be AI-mplemented. But that's maybe an idea to explore down the line :)


Regarding automatic selectors: check out my open source library https://github.com/mherrmann/selenium-python-helium.


This is actually very interesting. I like how it's done. Very great job.

We do the same thing, but we also understand the box (context), so we can say: we should be in the Sign in box, and a button called Sign in should be present. If we are on a different page and cannot find anything similar, we give better error messages, like: we should have been on the Sign in page but we are on Sign up.
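
As an illustration only (this is not how the product described above works, which the thread does not specify), a context-aware error message could be produced by comparing the labels seen on screen against the labels expected on each known page. All page names, labels, and the `describe_mismatch` helper are hypothetical.

```python
from typing import Dict, Set


def describe_mismatch(expected: str, observed: Set[str], pages: Dict[str, Set[str]]) -> str:
    """Guess which known page `observed` belongs to and phrase the error accordingly."""

    def overlap(name: str) -> float:
        # Jaccard similarity between a page's expected labels and what was observed.
        labels = pages[name]
        union = labels | observed
        return len(labels & observed) / len(union) if union else 0.0

    best = max(pages, key=overlap)
    if best == expected:
        return f"on the {expected} page, but an expected element is missing"
    return f"expected the {expected} page, but this looks like the {best} page"


pages = {
    "Sign in": {"Email", "Password", "Sign in", "Forgot password?"},
    "Sign up": {"Email", "Password", "Confirm password", "Create account"},
}
observed = {"Email", "Password", "Confirm password", "Create account"}
message = describe_mismatch("Sign in", observed, pages)
```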


The DOM timeline you describe would be pretty straightforward to create using a MutationObserver [0]. It's available in all major browsers.

[0] https://developer.mozilla.org/en-US/docs/Web/API/MutationObs...


We actually do observe the DOM and we should expect new objects on the page.

We do this in a different part of our product, where we try to derive test coverage from how your users use your app.



Why would someone not spell that out in the title, I wonder.


I'm wondering why people would want the Project Object Model for Java.


Sorry, I didn't want to make the title too long.

I think there are still too many companies using Java and Selenium. No?


For Selenium? No, absolutely not. In fact, I would happily pay someone to not add to the horror that is Selenium. I would, however, be interested in a POM generator for Cypress or Web Driver.


So are Cypress/Puppeteer/Playwright really much more stable than Selenium?

I still think any E2E test tooling will have the same issues and be simply horrible.


Yes, way more stable. At least cypress, I don't have much experience with playwright. We never had an issue regarding brittleness or stability. We do, however, get sometimes frustrated over certain API choices, like the way sessions and cookies are managed and how hard it is to keep a value throughout a test suite etc.


Quite literally every Selenium test I’ve had access to has been a total disaster. When your build pipelines fail 50/50 because of some bullshit Selenium timeout no one can reproduce that resolves itself the next run, it’s time to look at another tool. Not to mention that the Selenium port for .NET is a literal verbatim translation from Java and written by people clearly not versed in C#. The API and way the library works are nothing like how you’d expect a C# library to work.

For example class properties are not meant to do expensive work and yet they directly interface with the IPC to the browser. This means debugging in Visual Studio becomes impossible in any meaningful way. Hovering over a property while paused on a breakpoint will trigger some command in the browser that might throw an exception which causes an exception in the debugger.

Playwright, Cypress, Web Driver are simply what should be considered the defaults in 2022.


100% agree, but I think there are still people using Selenium. I think we could help people migrate to Cypress or Playwright or no-code solutions.


Hi, we actually have done this for Cypress. I'd love to get your feedback on it: https://cypress.preflight.com/


As an SDET, I'm always looking for quicker/easier ways to get selectors that are non-brittle. The trick is to analyze either the CSS or XPath selector and optimize it so it's neither brittle nor a mile long. This is something I've learned to do by hand, but optimizing as I've indicated doesn't seem to be a part of the tools I've seen.
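
One simple heuristic for this kind of optimization (not necessarily the commenter's method) is to drop leading ancestors from a full path for as long as the remainder still matches exactly one node. The sketch below uses the limited XPath subset in Python's `xml.etree.ElementTree` on a toy document; `shortest_unique_xpath` and the markup are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List


def shortest_unique_xpath(root: ET.Element, steps: List[str]) -> str:
    """Return the shortest trailing part of `steps` that matches exactly one node."""
    # Try the shortest suffix first, then add ancestors back one at a time.
    for start in range(len(steps) - 1, -1, -1):
        candidate = ".//" + "/".join(steps[start:])
        if len(root.findall(candidate)) == 1:
            return candidate
    raise LookupError("no suffix of the path is unique")


doc = ET.fromstring(
    "<body><section>"
    "<div id='nav'><form><button type='submit'>Search</button></form></div>"
    "<div id='main'><form><button type='submit'>Sign in</button></form></div>"
    "</section></body>"
)
# Full path from below the root down to the Sign in button.
full_path = ["section", "div[@id='main']", "form", "button[@type='submit']"]
short = shortest_unique_xpath(doc, full_path)
```

Here the `section` step is dropped, while `div[@id='main']` has to stay because two submit buttons exist. A real optimizer would also need to weigh which attributes are stable, which this sketch ignores.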

Also of note: Selenium was really designed for "Web 1.x" and doesn't assume the dynamic DOMs of today. Modern frameworks like Playwright are built with the understanding that the DOM is dynamic and are more robust. So - consider Playwright (or something like it - although I think it's "best of breed") vs. Selenium.


A long time ago, I worked on my Page Object Generator pet project, a desktop app allowing me to record/test page elements via Selenium Webdriver. https://github.com/dzharii/swd-recorder

It was a time when I had to test an IE6-only enterprise web tool, and there was no option to copy a CSS selector in Chrome/Firefox. I am not working in UI automation anymore (sad :( ) and lost interest in this project, but I think having a tool to record page objects and generate some smart automation code can still be quite helpful.


That sounds awful I'm sorry :D

I'll drop you an email when we are able to launch it. Would love to get your feedback


I would absolutely pay for this, but only if it could work with web applications built in React. React generates CSS class/ID names, some of which are duplicates. This requires determining convoluted XPATHs which can very easily, and often do, change in ways that immediately break the pre-defined locators. I’m aware of using unique attributes to make elements easier to locate, but that often requires someone else’s buy-in which isn’t always a guarantee. Even having a way to run the application in a headless browser and generating the POM programmatically would be great.


Given that I don’t know what this is, and that Selenium’s API is well-known to promote flakey tests (it’s flakey by default and needs layers of abstractions to add waits, delays, etc), I’d stay away.
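
The "layers of abstractions" usually boil down to a polling wait like the one below. This is a generic sketch with hypothetical names; Selenium's Python bindings ship their own version of the idea as `WebDriverWait(...).until(...)`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    condition: Callable[[], Optional[T]],
    timeout: float = 5.0,
    poll: float = 0.05,
) -> T:
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(poll)


# Simulated element that only "appears" on the third check.
attempts = {"n": 0}


def element_loaded() -> Optional[str]:
    attempts["n"] += 1
    return "<button>" if attempts["n"] >= 3 else None


found = wait_until(element_loaded, timeout=1.0, poll=0.01)
```

Waiting on a condition rather than sleeping for a fixed time is what keeps such tests from being both slow and flaky.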


Aren't all E2E test frameworks flaky by default? When you're testing the end user experience that goes through multiple layers of services, I think that should be expected.

Would love to learn from your experiences on any better frameworks you might have used, though!


Yeah, Selenium is pretty flakey. I've been pretty happy with https://playwright.dev . Kind of the successor of puppeteer. At least, some of the same devs went from Google to MS. Here's an example where MS has better cross-browser support than Google.


I second this. I think playwright and cypress are the main ones you should use. We already implemented something for Cypress and are looking to do the same thing for Playwright. It'll be pretty similar to this: cypress.preflight.com

Would love to get your feedback so we can add it :)


No, not all e2e frameworks are flakey. Cypress allows you to intercept network calls with ease. The same can be done in Playwright as well.


We’re using Hurl [1] at work for integration tests with very good success. We’ve eliminated false positives and flaky tests: it’s a simple tool that runs HTTP requests and you can add asserts on responses.

It’s as if you would test your app with curl, very fast and reliable. On the other hand, contrary to Selenium, there is no Javascript engine so you can only test the “raw” DOM or json response sent by the network (and not a DOM managed and rendered by a Javascript front end framework).

(Disclaimer: I’m one of the Hurl maintainers)

[1]: https://hurl.dev


Seems hurl is quite a different thing than browser automation though?

API/request level testing is great when appropriate, but testing what happens in browser won't be possible without using something like selenium, nightwatch or playwright


Yes totally.

On the other hand, you can also easily test use cases where the browser is “helping” you (for instance, you want to test that your backend didn’t accept an invalid email, but your HTML form has HTML5 validation that prevents a user from entering an invalid email). Or you want to test HttpOnly cookie attributes. But it can’t do UI integration tests.


Cypress completely solved flakey UI tests for me. I used Selenium for years but would never go back now.


Switched from Cypress to Playwright a while ago. It is noticeably quicker to respond to network request waits. Additionally, I prefer Playwright’s locator abstractions to Cypress’ queue/alias system.


I really like playwright's locators as well. It looks pretty cool. We actually have a Cypress POM generator tool which uses AI and lets you test emails etc: https://cypress.preflight.com/


Sounds somewhat similar to Dakka: https://www.dakka.dev/


I have heard of this actually. It's a nice open source tool. But it doesn't have AI selectors, no?


Get DataDog. Launch their synthetic ux monitor.

Implement that for scraping and that would be great I think.


Can I get your feedback on preflight.com vs Datadog?



